Editorial

New Co-Editor-in-Chief 

Manuscripts must now be submitted via the Internet 
Single topic issues and volumes 


New Co-Editor-in-Chief 

We are very pleased to announce that William (Bill) Franklin,
PhD, has become our second Co-Editor-in-Chief.

Bill received his PhD in Biophysics at Harvard University. His
primary interest is in the area of DNA damage and repair and 
he has published some 84 original articles, book chapters and 
abstracts. Bill has a thorough knowledge of molecular biology 
and genetics and a very good knowledge of biology in general. 
We are very pleased that Bill has joined the Editorial Board of 
Cytogenetic and Genome Research. His help is essential now 
that the amount of work has increased considerably as a result 
of the many single topic volumes we are currently publishing 
and will continue to publish in the coming years (see below). 

Manuscripts must now be submitted via the Internet 

Authors are now required to submit their manuscripts via a
simple and efficient Manuscript Processing System we have
developed. This can be accessed at:
http://cytserver.aecom.yu.edu/ccg/submit/. Successful submission
is confirmed immediately by the system and a tracking number is
issued for use in the event of a mishap. Authors will also receive
a second confirmation via e-mail within two working days. This
message will inform them of their manuscript number (not the same
as the tracking number) and a password that will allow them to
access their folder in the CGR Manuscript Database. They will also
be informed of how their manuscript will be processed, and in
those cases where the manuscript will be processed by an Executive
or Associate Editor, the message will contain the Editor’s name
and contact information. All correspondence relating to their
paper should then be addressed to that Editor via the web
submission system. Copies of all e-mail messages from the
Editorial Office, from authors, or from members of the Editorial
Board will also be put into the manuscript folder.

Therefore, the manuscript folder, accessed by the password,
will serve throughout the entire processing period and will
contain all the information related to that paper. Authors will be
informed via e-mail when the review process has been completed so
that they can examine their folder, which will now contain the
decision as to the acceptance of the paper for publication,
recommendations of the reviewers, indications of any revisions
that may be required and any other communications from the
Editors.

If revisions are required, the authors will then upload their
revised paper to the same folder. When the copy editing of the
manuscript is completed the authors will be instructed to
download their electronic “galley proof”, make any corrections
that are necessary, and upload it again to their folder. The
folder will also contain a reprint order form customized to the
particular paper indicating reprint costs and any costs for color
plates or extra pages.

We are confident that this system will accelerate the entire 
manuscript processing procedure and will simplify all stages of 
the manuscript publication procedure. 

Single topic issues and volumes 

We are very pleased to inform our readers about a series of
single topic volumes we are now publishing. For these volumes
we are soliciting the assistance of one or more expert
investigators to act as Guest Editors in an area that is
particularly interesting and active and/or one in which there is a
need for a thorough overview. The Guest Editors then invite the
top researchers in the area to contribute original research
reports or reviews of a topic that is in their main area of
interest. These papers are peer reviewed the same way as other
papers received by this Journal.

Published and to be published single topic issues and volumes

A. Volumes that have been published or are in press

Title of single topic issue or volume | Guest editor(s) | Total no. of papers and pages | Volume, issue no. and year
Vertebrate Sex Chromosomes | Nobuo Takagi, Sapporo, Japan | 46 - 352 | 99:1-4 (2002)
Nucleotide and Protein Expansions and Human Disease | Josef Gecz and Grant R. Sutherland, Adelaide, Australia | 29 - 298 | 100:1-4 (2003)
Third International Symposium on Vertebrate Sex Determination | Valentine Lance, San Diego, CA, USA | 21 - 148 | 101:3-4 (2003), this double issue
Animal Genomics | Bhanu Chowdhary, College Station, TX, USA | 60 - approx. 350 | 102:1-4 (2003), in press
Molecular Aspects of Mouse Spermatogenesis | Ricardo Benavente, Würzburg, Germany | 18 - approx. 140 | 103:3-4 (2003), in press

B. Issues and volumes to be published in the next few years (not necessarily in order shown)

Title of single topic issue or volume | Guest editor(s) | Anticipated year of publication
Chromosome Aberrations | Günter Obe, Essen, Germany and Adajapalam T. Natarajan, Leiden, The Netherlands | 104:1-4 (2004)
Primate Cytogenetics, Genome Organization and Evolution | Steffan Müller, München, Germany | 2004
Retrotransposable Elements and Genome Evolution | Jean-Nicolas Volff, Würzburg, Germany | 2004
B Chromosomes in the Eukaryote Genome | Juan Pedro Camacho, Granada, Spain | 2004
Repair Proteins in Meiosis | Paula Cohen, New York, NY, USA | 2004
XV International Chromosome Conference | Darren Griffin and Joanna Bridger, London, UK | 2004
Mouse Genetics after the Mouse Genome | Silvia Garagna, Pavia, Italy | 2005
Plant Cytogenetics | Maria Puertas and Tomas Naranjo, Madrid, Spain | 2005
Hereditary Nephropathies | Klaus Zerres, Bonn, Germany | 2006
Chorea Huntington | Jörg T. Epplen, Bochum, Germany | 2006

In the above table we list those single topic issues and
volumes that have appeared, as well as those scheduled for the
next few years. We are very pleased with the very high scientific
level of the volumes that have appeared and the papers we have
received for forthcoming publications. We are confident many
of our readers will find these publications very informative and
an excellent source of reference for quite some time.


These single topic issues and volumes are included in the
subscription to Cytogenetic and Genome Research. Hard or soft
cover versions can also be purchased separately from the
Publisher.

We very much welcome suggestions any of our readers may 
have for other such single topic volumes. 

Harold P. Klinger 
Michael Schmid 
November, 2003 
Preface


The early 1980s marked the beginning of a new era in animal
genetics - an era to which most of us can directly relate.
Riding on the excitement surrounding the availability of
recombinant DNA technology and the development of the human gene
map, animal geneticists around the world started taking subtle
but decisive steps toward the development of gene maps in
domesticated animals. Identification of syntenic and linkage
groups in cattle, pig and horse was among the early key
developments that provided stimulus and laid the foundations
for future gene maps. Almost 20 years have passed since these
initial steps were taken. During this period animal genomics has
evolved in an unprecedented manner and is progressing in the
true spirit of a “revolution” - the “genomics revolution”.

Organized genome analysis projects in cattle and pigs were
initiated during the late 1980s. Since then, the number of
domesticated species being analyzed has expanded tremendously.
Today, genome programs are in place for almost all important
domesticated animals. The objectives are simple and perhaps
age-old: to improve production, reproduction, disease resistance,
diagnostics and health care. However, the tools to reach
these goals and the expected outcome have changed. Exciting,
yet unbelievable, is the fact that for some species like chicken,
cattle and dog, whole genome sequencing is either in progress
or is close to finishing. For others like pig, cat etc., blueprints
for sequencing are in place. Five years ago, all this was
unimaginable. No doubt, these are exciting times in animal genomics.

In the fall of 2001, the editorial board meeting of Cytogenetic
and Genome Research discussed possibilities of several new
issues of the journal that could focus on emerging areas in
genome research. I was thrilled that animal genomics was
recognized as one of the rapidly evolving research areas. However, I
was not so thrilled (let me be honest!) when the burden of all
this fell on my shoulders - not because this was a challenging
task, but because there are stalwarts who could do it better.
With no escape, I have simply acted as a junction-point to
channel your research and thoughts.

The joy for me in preparing this volume has been the incredible
and relentless enthusiasm from the contributors, remarkable
turn-around time by the reviewers and constant encouragement
from Michael Schmid, the chief editor. Together, we
all have a wonderful tribute to 20 years of animal genome
research.

Enjoy reading it! 

Bhanu Chowdhary 

College Station, November 2003 
From biochemical genetics to DNA sequencing
and beyond: The changing face of animal
genomics

J. Womack 

Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station TX (USA) 


Animal genomics has made remarkable progress within a
very short span of time. It is hard to believe that within a
20-year period, gene maps of livestock and companion species
have transformed from nascent entities to highly developed
and informative units. Having witnessed it from the beginning,
I will try to give you a brief chronology of the developments.

The predecessors of animal genomics struggled to find linkage
associations of observable phenotypes with immunological
and biochemical markers available to them in the middle of the
last century. It is not surprising that early successes in animal
gene mapping were in the discovery of X linkages revealed by
dam to son inheritance. Autosomal linkages were hard to come
by with the total number of polymorphic markers in most animal
species less than a few dozen, mostly blood antigens and
biochemical polymorphisms revealed by electrophoresis. Finding
an autosomal linkage generally highlighted a successful career
in animal genetics research through the 1960’s and into the
70’s.

Somatic cell genetics provided the first “whole genome”
approach to the assignment of genes to chromosomes or synteny
groups in domestic animals. Following the lead of human
genetics in the 1970’s, panels of hybrid somatic cells segregating
cat and cattle chromosomes were developed in the early 1980’s.
While these panels were initially characterized with only
biochemical markers, they generated syntenic groups (genes on the
same chromosome) which were subsequently assigned to specific
chromosomes. These maps, called synteny maps to avoid
confusion with linkage maps, became the basis for comparative
gene mapping between species. Unfortunately, the term synteny
is now almost universally, but very inappropriately, used in
place of “conserved synteny” to define segments of homology
between species. The introduction of cloned molecular probes
and Southern blotting greatly magnified the power of somatic
cell genetics and it became the technique of choice to rapidly
develop whole genome maps of a variety of other species in the
late 80’s and early 90’s, thus the birth of animal genomics.

Request reprints from Jim Womack, Department of Veterinary Pathobiology
College of Veterinary Medicine, Texas A&M University
College Station TX-77843 (USA); telephone: 1-979-845-9810
fax: 1-979-845-9972; e-mail: Jwomack@cvm.tamu.edu

Whole genome linkage maps of domestic animals emerged
in the early 1990’s, thanks largely to the discovery of
microsatellite markers and the collection of DNA from two and three
generation families for distribution to communities of
collaborating scientists. These maps, several quickly developed to a
3-5 cM level of resolution, became the backbone for genome
scanning for genes responsible for visible traits, inherited
diseases and ultimately for quantitative trait loci (QTL) of
economic importance.

Cytogenetics flourished with the availability of molecular
probes in the 80’s, leading to the assignment of genes to
chromosomes (beginning with the MHC in most animals). The
evolution of isotopic in situ hybridization (IISH) into fluorescence
in situ hybridization (FISH) generated a technique of great
value in anchoring somatic cell, linkage, and radiation hybrid
(RH) maps to chromosomes. The isolation of purified individual
chromosomes from one species (first from human) for
whole chromosome painting by Zoo-FISH has become a powerful
technique for chromosome level comparisons of the
genomes of different animals. Fiber-FISH is now being employed
to order markers along extended DNA from domestic
animal genomes.

While valuable for comparative mapping at a gross chromosome
level, neither somatic cell genetics nor Zoo-FISH permits
the comparison of marker order within conserved syntenic
groups. Linkage maps, composed primarily of non-conserved
microsatellites, have limitations in comparative mapping as
well. The advent of RH mapping provided a new spark of life to
comparative genomics. ESTs from domestic animals with significant
BLAST hits in genome databases of human and mouse
enhance the value of RH maps for comparative mapping. Some
animal RH maps now contain several thousand comparative
markers, providing comparison of gene order between species
at a 1-2 Mb level of resolution.

Genomic maps entered the “application phase” with mapping
of traits, and particularly with the mapping of QTL beginning
in the early to mid 1990’s, shortly after comprehensive
linkage maps became available for genome scanning. The number
of traits, both simple and quantitative, now associated with
chromosomes is several dozen in some animal species. In fact,
we have generated a significant bottleneck of large numbers of
mapped traits awaiting the heretofore slow and tedious process
of discovery of the genes underlying the phenotypic variation.
Thanks to good comparative maps and BAC and YAC contigs
spanning QTL, a handful of genes contributing to variation in
interesting or economically important traits have been mined
from genome scanning projects. Hopefully, the next phase of
animal genomics will accelerate the process of gene discovery.

The sequencing phase of animal genomics has begun! I can’t
count the number of times I have said or written as justification
for my own work in comparative mapping: “We will never have
a complete sequence of the cattle genome.” I believed this as
recently as two years ago. How exciting it is to note that a 1x
dog sequence is already reported, the chicken will soon be ready,
and the cow sequence will begin before the end of 2003. The
pig and cat can’t be far behind and I can hear the horse coming
up the backstretch. I will not declare sheep out of the picture,
despite their close relationship to cattle. Neither will I declare
comparative genomics obsolete.

Structural and comparative genomics have served us well
and will serve us even better in the sequencing era. It is
certainly time to gear up for functional genomics in domestic animals.
Expression arrays are currently available for a few domestic
animal species. Moreover, bioinformatics tools are beginning
to emerge to enhance comparative approaches to gene discovery
and to define interactions of multiple loci. A vast array of
phenotypic variability in our favorite species presents us with a
scientific challenge worthy of our best efforts over the next
decade.

My thanks to Cytogenetic and Genome Research for celebrating
20 years of animal genome research with this special issue
which highlights the current status of our discipline. It is
remarkable that we have come so far so quickly with the limited
resources available to animal genomics. As an observer of the
development of animal genomics over these two decades, I
attribute our success largely to a spirit of cooperation among
animal geneticists which I have not always seen in other
scientific disciplines. I hope you will enjoy the excellent papers
contributed to this issue and I hope you will keep alive the spirit of
scientific cooperation that has led to our collective success.
How it all began

Looking back 15 years, at a time when the International
Society for Animal Genetics formed its first standing committee
to coordinate research in cattle, sheep and goat gene mapping,
sequencing the cattle genome was nothing more than a
fantasy even among the most influential animal geneticists.
The unifying objective that brought the animal genetics
community together in 1988 was to create linkage maps with
polymorphic markers covering all the chromosomes at a spacing
that would enable the mapping of loci affecting quantitative
traits (QTLs). The foundations for QTL mapping and implementation
in breeding plans were laid with the theoretical work
of Morris Soller and his colleagues (Beckmann and Soller,
1983; Soller, 1990) and the experimental work of others,
particularly in the area of immunogenetics, which demonstrated that
genes affecting resistance to infectious diseases (e.g., Xu et al.,
1993) and production traits (e.g., Beever et al., 1990) could be
identified by association.¹ The parallel development of
synteny-based comparative gene maps followed by the discovery of
highly polymorphic minisatellite and microsatellite markers in
the cattle genome set the stage for the crash programs to create
linkage maps that would facilitate the identification of QTLs.
Four independent low-to-moderate resolution linkage maps
were eventually produced by 1996 (Barendse et al., 1994;
Bishop et al., 1994; Georges et al., 1995; Ma et al., 1996),
resulting in an explosion of mapped genes for monogenic and
polygenic traits. In particular, the program at USDA-MARC
provided a reliable linkage map as well as publicly available
resources that catalyzed the world-wide revolution in trait
mapping that has followed since the mid 1990s. The early
applications of the cattle linkage map included mapping of QTL
affecting milk production and composition, somatic cell score,
growth and carcass traits (Georges et al., 1995; Heyen et al.,
1999; Keele et al., 1999). In addition, the genes for
double-muscling (Muscular Hypertrophy), Weaver Disease (Progressive
Degenerative Myeloencephalopathy) and horns in cattle were
mapped by linkage analysis primarily using anonymous markers
(Nicholas, 2003). In retrospect, the relentless collaborative
efforts by a relatively small international group of scientists
changed the entire course of animal genetics, simultaneously
bringing the discipline into the mainstream of genome science.

Supported in part by grants AG99-35205-8534, AG2002-35205-11625, AG2002-
34480-11828 and AG58-5438-2-3 13 from the United States Department of Agri-

Somatic cell genetics revolutionizes cattle genomics

From the late 1980s to the mid 1990s, progress in gene mapping
was painstakingly slow. The methods and approaches used
until that time included linkage analysis (primarily RFLPs,
microsatellite and minisatellite markers), FISH mapping, and
synteny mapping (reviewed by Fries et al., 1993; Womack and
Kata, 1995; Table 1). Synteny mapping using a somatic cell
(synteny) panel was by far the most efficient comparative
mapping method, resulting in >500 loci mapped. However, the
method cannot be used to create an ordered map of loci,
limiting its usefulness for candidate gene identification.

Table 1. Map status and genomics resources for cattle

Chromosomes: 29 autosomes, X, Y
Genome size (predicted): 3,000 Mbp
Mitochondrial genome: 16,338 bp

Mapped markers

Map type              Total    Type I   Type II
Synteny map           1,800    —        —
Genetic linkage map   2,300    ~200     2,100
Cytogenetic map       >300     —
RH map                3,151    1,654    1,497
Comparative map       1,556    1,556    —

Resources

EST sequences in GenBank: >326,000
EST sequences in progress: unknown
cDNA libraries: ~76 libraries
BAC libraries:
(1) Texas A&M (TAMU), TAMBT (4x) Angus Beef
(2) Institut National de la Recherche Agronomique (INRA) (4x), Holstein Bull
(3) Children’s Hospital of Oakland Research Institute: CHORI-240 (~11x) Hereford Bull
(4) Children’s Hospital of Oakland Research Institute: RPCI-42 library (~12x) Holstein Bull
Fingerprinted BAC libraries: four BAC libraries integrated into one database for sequence assembly

Websites (Databases)

http://www.thearkdb.org/browser?species=cow (Roslin Institute ArkDB)
http://locus.jouy.inra.fr/cgi-bin/lgbc/mapping/common/intro2.pl?BASE=cattle (INRA, BovMap database)
http://www.ncbi.nlm.nih.gov/genomes/framik.cgi?db=Genome&gi=T0415 (NCBI, Bos taurus mitochondrion, complete genome)
http://cagst.animal.uiuc.edu (Illinois Reference/Resource Families)
http://titan.biotec.uiuc.edu/cattle/cattle_project.htm (Cattle EST project)
http://www.marc.usda.gov/genome/genome.html (Meat Animal Research Center (MARC), USDA)
http://bos.cvm.tamu.edu/bovgbase.html

The modern era of cattle genomics can be traced to the
development of a cattle-hamster radiation hybrid panel (Womack
et al., 1997), a mapping approach that eliminated the
laborious process of producing an ordered gene map by
identifying polymorphic markers and scoring segregation of these
markers in large families. The parallel progress in producing
vast numbers of expressed sequence tags (ESTs) and methods
for high throughput sequence similarity searches using the
BLAST algorithm literally transformed cattle genomics overnight.
An important conceptual breakthrough was that cattle
ESTs provided a powerful resource for comparative mapping.
The basic principle was demonstrated by Ma and coworkers
(Ma et al., 1998) who showed that at least 75% of cattle ESTs
contained enough sequence information to identify putative
human orthologs using BLASTN, and that such ESTs mapped
to the chromosomes expected on the basis of the then existing
synteny-based comparative mapping information. Subsequently,
when cattle ESTs were mapped on the RH panel they provided
comparative anchor points to the human genome (Ozawa
et al., 2000; Band et al., 2000). Having a high density RH
map also meant that any DNA sequence with significant
similarity to the human genome could be mapped in the cattle
genome in silico. The in silico approach developed to exploit
the extensive conservation of genome organization in mammals
(Ma et al., 1998; Rebeiz and Lewin, 2000), termed COMPASS
(comparative mapping by annotation and sequence similarity),
greatly facilitated targeted RH mapping in gene-poor
regions of the cattle map as well as the filling in of gaps in the
cattle-human comparative map. The moderate resolution RH
maps produced by COMPASS-RH mapping led to a greater
understanding of the evolution of mammalian chromosomes
(e.g., Band et al., 1998, 2000). Moreover, these maps provided
a critical resource for the identification of trait genes using the
comparative positional candidate gene approach, including the
genes responsible for double-muscling (Grobet et al., 1997;
Kambadur et al., 1997), chondrodysplasia (Takeda et al.,
2002), and a major QTL for milk production (Grisart et al.,
2002). The impressive experiments leading to the identification
of specific mutations resulting in traits of importance to
animal agriculture validated the investments made in funding
comparative mapping research.
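
The in silico idea behind COMPASS, predicting where a sequence should lie in the cattle genome from the position of its human match, can be illustrated with a short sketch. This is not the published COMPASS implementation. The segment table, the chromosome labels (`HSA_a`, `BTA_p`, `BTA_q`), the coordinates and the function name are all hypothetical placeholders; in practice the conserved segments would come from the cattle-human comparative map and the human position from a BLAST hit.

```python
from typing import NamedTuple, Optional, Sequence


class ConservedSegment(NamedTuple):
    """One block of conserved synteny between a human and a cattle chromosome."""
    human_chrom: str
    human_start_mb: float  # inclusive
    human_end_mb: float    # exclusive
    cattle_chrom: str


# Placeholder segments for illustration only (NOT real map data).
EXAMPLE_SEGMENTS = (
    ConservedSegment("HSA_a", 0.0, 50.0, "BTA_p"),
    ConservedSegment("HSA_a", 50.0, 135.0, "BTA_q"),
)


def predict_cattle_chromosome(
    human_chrom: str,
    human_pos_mb: float,
    segments: Sequence[ConservedSegment] = EXAMPLE_SEGMENTS,
) -> Optional[str]:
    """Return the cattle chromosome predicted for a human map position.

    The position is looked up in the table of conserved segments; None is
    returned when no segment covers it (a gap in the comparative map).
    """
    for seg in segments:
        if (seg.human_chrom == human_chrom
                and seg.human_start_mb <= human_pos_mb < seg.human_end_mb):
            return seg.cattle_chrom
    return None


if __name__ == "__main__":
    # A cattle EST whose best human match lies at 60 Mb on the placeholder
    # chromosome HSA_a is predicted to map to the placeholder BTA_q.
    print(predict_cattle_chromosome("HSA_a", 60.0))
```

A prediction of this kind only narrows the search: it suggests which region of the RH panel to test, and positions that fall outside every known segment are exactly the gaps that targeted RH mapping is meant to fill.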

The end-game in RH-based cattle comparative mapping has
already begun, setting the stage for high resolution
multi-species comparative analysis and providing the necessary
resources for the correct ordering of the BAC contigs on the cattle
chromosomes. Using BAC-end sequences (BESs) and the extensive
cattle EST resources, high-resolution RH maps were
recently created that revealed intimate details of mammalian
chromosome evolution (Larkin et al., 2003). The proposed
~6x genome coverage (see below) will include thousands of
gaps, resulting in unordered sequence contigs on the cattle
genome map. The large number of sequence contigs expected
cannot be ordered correctly on the cattle chromosomes unless
they have reference coordinates on the cattle genome map. The
1 Mb resolution cattle-human comparative map recently
constructed for HSA11 using cattle BESs (e.g., Larkin et al., 2003)
is an excellent example of how BESs will provide the necessary
resource for “sealing” the remaining gaps. Because the BESs
used to make the map will be incorporated into contigs during
the genome sequencing effort, the necessary linkages between
the RH map and the sequence contigs will be made, thus
facilitating the whole genome sequence assembly process. A high
resolution RH map will therefore improve the accuracy and
ultimate utility of the whole genome sequence. Moreover, the
comparatively anchored whole-genome BAC contig will make
it possible to select cattle BACs containing nearly every known
gene, even prior to obtaining the whole genome sequence.
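
The role of the RH map as a source of reference coordinates can be sketched as follows. The example is a simplified illustration under stated assumptions, not the assembly procedure used by the sequencing centers: contig and marker names are hypothetical, each contig is placed on the chromosome that most of its anchored markers agree on, and its position is taken as the median RH position of those markers. Contigs without any mapped marker cannot be placed, which is the problem the paragraph above describes.

```python
from statistics import median
from typing import Dict, List, Tuple


def order_contigs(
    contig_markers: Dict[str, List[str]],
    rh_position: Dict[str, Tuple[str, float]],
) -> Tuple[List[Tuple[str, float, str]], List[str]]:
    """Order sequence contigs using markers (e.g. BESs) placed on an RH map.

    contig_markers maps a contig name to the markers it contains.
    rh_position maps a marker name to (chromosome, position on the RH map).
    Returns (placed, unplaced): placed is a list of (chromosome, position,
    contig) sorted by chromosome and position; unplaced lists contigs that
    contain no mapped marker.
    """
    placed = []
    unplaced = []
    for contig, markers in contig_markers.items():
        anchors = [rh_position[m] for m in markers if m in rh_position]
        if not anchors:
            unplaced.append(contig)
            continue
        chroms = [chrom for chrom, _ in anchors]
        # Majority vote on the chromosome; ties are broken alphabetically.
        best_chrom = max(sorted(set(chroms)), key=chroms.count)
        position = median(pos for chrom, pos in anchors if chrom == best_chrom)
        placed.append((best_chrom, position, contig))
    placed.sort()
    return placed, unplaced


if __name__ == "__main__":
    # Hypothetical contigs and marker positions, for illustration only.
    contigs = {"ctgA": ["bes3"], "ctgB": ["bes1", "bes2"], "ctgC": ["bes9"]}
    rh_map = {
        "bes1": ("BTA15", 10.0),
        "bes2": ("BTA15", 14.0),
        "bes3": ("BTA15", 40.0),
    }
    print(order_contigs(contigs, rh_map))
```

The denser the RH map, the fewer contigs end up in the unplaced list, which is one way to read the claim that a high resolution RH map improves the final assembly.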

The bovine genome sequencing initiative 

The year 2004 will witness the inauguration of the cattle
genome-sequencing project, a two-year multinational effort
that will be greatly enhanced by lessons learned from sequencing
the human, mouse and rat genomes (Gibbs et al., 2002).
The cattle genome sequencing initiative will exploit new
strategies and enhanced technologies that will increase the fidelity of
the sequence assembly and greatly reduce costs. The vast
majority of the sequencing will likely be done in two
laboratories, The Human Genome Sequencing Center at the Baylor
College of Medicine, Houston, TX-USA, and the Genome
Sciences Centre in Vancouver, Canada. Other laboratories may
be involved in finishing specified regions and identifying single
nucleotide polymorphisms (SNPs) so as to facilitate positional
cloning of economically important genes and haplotypes. The
proposed sequencing strategy will incorporate three subprojects
that will result in an assembly with ~6x genome coverage: a 1x
clone coverage BAC skim at 1-2x coverage per BAC, BAC-end
sequencing, and whole-genome shotgun sequencing at 5-6x
genome coverage using a variety of insert sizes (2 kbp up to
50 kbp).
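
What a given fold coverage implies can be estimated with the classical Lander-Waterman model of random shotgun sequencing. The sketch below is a back-of-the-envelope illustration only. It assumes uniformly random reads, ignores the minimum overlap needed to join two reads, and uses a read length of 500 bp, which is an assumption made here and not a figure from the text. The 3,000 Mbp genome size is the predicted value given in Table 1.

```python
import math
from typing import Tuple


def shotgun_estimates(
    genome_bp: float, read_len_bp: float, coverage: float
) -> Tuple[float, float, float]:
    """Idealized Lander-Waterman estimates for random shotgun sequencing.

    Returns (number of reads, fraction of genome left uncovered,
    expected number of gaps). With N reads of length L on a genome of
    size G, coverage is c = N * L / G, the expected uncovered fraction
    is exp(-c), and the expected number of contigs (and hence gaps) is
    roughly N * exp(-c) when the overlap requirement is ignored.
    """
    n_reads = coverage * genome_bp / read_len_bp
    fraction_uncovered = math.exp(-coverage)
    expected_gaps = n_reads * math.exp(-coverage)
    return n_reads, fraction_uncovered, expected_gaps


if __name__ == "__main__":
    # 3,000 Mbp genome (Table 1), assumed 500 bp reads, ~6x coverage.
    reads, uncovered, gaps = shotgun_estimates(3.0e9, 500.0, 6.0)
    print(f"reads needed:       {reads:.3g}")
    print(f"fraction uncovered: {uncovered:.4f}")
    print(f"expected gaps:      {gaps:.3g}")
```

Under these idealized assumptions even a ~6x assembly leaves a small uncovered fraction spread over a large number of gaps. Real projects reduce the practical impact of those gaps with paired-end reads and clone-based maps, which is why the BAC skim and BAC-end sequencing are part of the proposed strategy.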

Significantly, cattle will be the first species for which a
comparatively anchored whole-genome physical map is available
prior to beginning the sequencing effort. The BAC skim will be
selected from a whole-genome BAC contig created by
fingerprinting and BAC-end sequencing. This phase of the project is
being conducted by The International Bovine BAC Map Consortium
(IBBMC), an affiliation of laboratories with expertise
in BAC fingerprinting, DNA sequencing, comparative genomics
and bioinformatics. As of this writing, the fingerprinting
and end-sequencing of the >192,000 clones in the CHORI-240
BAC library (Hereford male) have been completed and the
physical map enhanced by the addition of BAC clones from the
RPCI-42 library (Holstein male) and BAC libraries from INRA
(France) and Texas A&M University
(http://www.bcgsc.ca/lab/mapping/bovine). The near-term objective
is to provide a minimum tiling path of clones to the sequencing
labs for clone-by-clone BAC skim sequencing at 1x genome
coverage. The whole-genome shotgun will be performed with genomic
DNA from an inbred daughter of the Hereford bull used to make the
CHORI-240 library. The reduced genomic variability in the combined
shotgun and BAC skim will facilitate the correct assembly of
gene families and duplicated segments. As discussed above, the
integration of the physical map with the high resolution RH
map will aid the correct ordering of contigs on the cattle
chromosomes.

The combined BAC skim-shotgun approach will cost an
estimated $50 million, an order of magnitude less than the cost
of sequencing the human genome. Advances in automation and
sequencing technology are primarily responsible for the
plummeting cost of DNA sequencing. The cattle genome sequence
will thus represent an incredible value in economic terms, as
well as to science and agriculture. The combined strategy
should also help overcome the sequence assembly problems
presented by the large number and ubiquitous distribution of
repetitive elements in the cattle genome. The technical lessons
learned from sequencing the cattle genome will no doubt be
useful for sequencing the genomes of other animals of
agricultural, biological or evolutionary interest, such as the pig,
aquaculture species, monotremes and marsupials.

To achieve the maximum practical gain from the sequencing
effort it will be useful to have a large number of SNPs and a
comprehensive collection of ESTs and/or full length mRNA
sequences. The >326,000 cattle ESTs in the public domain
represent a powerful resource for annotation of the cattle
genome sequence. At the present time, there is no funding within
the current sequencing initiative for SNP identification and
validation or full-length cDNA library construction. Such
resources should be a priority for funding agencies that are
interested in applied genomics. Comparative and evolutionary
studies as well as practical applications, such as gene target
identification and validation, will be greatly facilitated by the
availability of dense SNP maps and full-length cDNA libraries.

Harvesting the promise 

What will be gained from sequencing the cattle genome? As
pointed out in the white paper proposal to sequence the cattle
genome (Gibbs et al., 2002) and elsewhere (Lewin et al., 2004),
the cattle genome is anticipated to play an important role in
annotating the human genome sequence because of the
relatively deep divergence time of the most recent common
ancestor of ruminants and primates. An important example of the
value of the phylogenomic approach in annotating human
genome sequence can be found in the report by Thomas and
coworkers (Thomas et al., 2003) who showed that coding
regions and previously unknown conserved non-coding
regulatory elements can be identified from large-scale, multi-species
DNA sequence alignments. With approximately half of all
conserved sequence among mammalian genomes residing in
non-coding DNA, it is essential to devise more robust methods to
determine the role these sequences play in gene regulation.

In animal biology, an early result of obtaining the cattle
genome sequence will be the identification of tens if not
hundreds of highly divergent homologs and novel mammalian
genes (Larson et al., 2003; Lewin et al., 2004). These genes may
provide a potent tool to investigate the origins of phenotypic
diversity among the different orders of mammals and a
potential resource of genes that can be manipulated to create or
enhance economically important phenotypes (Lewin et al.,
2004). In addition to differences in gene regulation (discussed
above), genes that are under strong selective pressures, as
identified by increased non-synonymous substitution rates, may in
part be responsible for the unique phenotypic and metabolic
adaptations that distinguish the more than 4,600 extant
mammalian species from one another. For example, mammals range
in size from ~3 cm and 2 g body weight (Pygmy shrew) to
more than 30 m and 172,300 kg (the blue whale, the largest
animal on earth). They may live underground in hypoxic
conditions (mole rats) or at greater than 3,600 meters (the yak). The
application of comparative genomics to identify genomic
differences rather than similarities represents an important
paradigm shift in mammalian genomics that has yet to be fully
appreciated, in part because of the limited availability and
phylogenetic distribution of current mammal whole genome
sequences. As sequencing costs continue to come down, and more
mammals from phylogenetically distinct clades enter the
sequencing pipeline, the subtractive comparative genomics
approach is likely to revolutionize evolutionary biology.

The direct benefit of the cattle genome sequence to animal agriculture
will likely come through application of “better-faster-cheaper”
technologies for SNP genotyping and DNA sequencing. The complete genome
sequence and large numbers of SNPs, both functional and non-functional,
should pave the way to fine mapping most of the economically important
QTLs and the identification of causative genes and mutations.
Inexpensive chips for resequencing QTL-dense regions will impact our
understanding of complex traits and the ability to manipulate these
traits using marker-assisted breeding. Breakthroughs in technology are
needed to make these methods affordable for large-scale studies with
cattle, but one can easily anticipate that such advances are just over
the horizon. Finally, a robust annotation of the cattle genome sequence
will be required for harvesting its full potential. Not only are better
methods required for annotating genes using bioinformatics, but novel
experimental systems must also be developed to confirm the in silico
predictions. Such methods are of prime importance for understanding the
functions of divergent homologs and novel genes. Clearly then,
investments must be made in high throughput in vitro systems for
functional annotation. A promising method currently available is RNA
interference (RNAi) technology (McManus and Sharp, 2002), which has the
potential to greatly accelerate a comparatively based and
species-centric gene annotation system. Already, a number of bovine
cell lines are in the public domain, as well as state-of-the-art
cloning and embryo culture techniques that will be invaluable to the
annotation process using RNAi technology. As the emphasis shifts back
toward animal physiology, improving these biological resources will be
necessary for the development of new products that enhance animal
health and productivity. They will also be necessary for bringing
cattle into the mainstream of biology, which is one of the more
important sidebars of having your favorite organism’s genome sequenced!

Functional genomics is the future 

A second major initiative that is likely to impact ruminant biology in
a more immediate and direct sense is the further development and
application of technology for gene expression profiling. Microarrays
for transcript profiling are being developed in parallel by many
laboratories throughout the world for investigation of the classical
physiological processes involved in animal production, such as growth,
reproduction, lactation and immunity (Band et al., 2002; Coussens et
al., 2002). The intersection of the effort in functional genomics with
the DNA sequencing initiative will result in an explosion of new
knowledge in cattle biology and a greater understanding of the genomic
changes that are associated with the ruminant adaptation (Lewin et al.,
2004).

Although incredibly powerful, there are limitations to the widespread
application of gene expression profiling technology to problems in
bovine biology. The technical advantages/limitations of the three major
currently used platforms, spotted cDNAs, spotted oligos and
photolithography, are widely known and will not be discussed in detail
here. Generally, there is a direct relationship between cost and
sensitivity, with spotted cDNAs being the least expensive (at least in
up-front costs) and the least sensitive, followed by spotted oligos and
the photolithographic methods. Given current evidence, the spotted
oligos will likely be the technology of choice for the foreseeable
future because of the reproducibility of the assays and lower costs of
creating the microarrays over the long term. Problems with oligo design
can be overcome, provided the input sequences are of sufficiently high
quality. Another advantage of the spotted oligos is that it will be
possible to rapidly incorporate additional genes onto microarrays as
the cattle genome is sequenced and annotated.

The relative expense of conducting large microarray experiments makes
it desirable to have the greatest genome coverage possible in the
initial profiling studies. The “discovery approach” flattens biases and
will ultimately lead to better hypothesis-driven experiments. Used in
conjunction with RNAi technology and appropriate bioinformatics tools,
the microarray discovery platform should permit the identification of
genes underlying QTL effects as well as a spectrum of new targets for
nutritional and pharmacological intervention. Investments in
appropriate biological resources, such as cell lines, tissue banks and
disease models, will be necessary for the full realization of the
potential of genomic biology.

Arguably, one of the most important and difficult problems to be faced
is the comparison of gene expression data across different species and
extraction of meaningful biological information from these comparisons.
The field of “comparative functional genomics” is just beginning, but
it promises to provide a data framework for explaining the
inter-species differences in cell, tissue and organismal biology.
Although the challenges are daunting, one can envision great advances
in developmental biology arising from comparing global gene expression
patterns, from the time of fertilization until parturition. The
comparative functional genomic data, combined with information on novel
genes, divergent homologs, conserved and non-conserved regulatory
elements and species-specific annotation, will create a path to a
golden era in mammalian biology. It should also provide the necessary
data infrastructure for integration with the larger realm of genomic
biology that includes other taxa.


Systems biology and data integration 

Animal science is, by nature, systems biology. Animal scientists have
long taken the view that an understanding of genetics, nutrition,
immunology, reproductive and environmental physiology is necessary for
the advancement of animal agriculture. The essential difference between
what is called systems biology today (Ideker et al., 2001) and the
traditional animal sciences is that the “new” systems biology begins at
the genome level with the goal of predicting biological responses,
whereas the traditional animal sciences approach usually begins with a
phenotype or biological response and then aims to identify the
underlying genetic or biochemical mechanisms. While it is possible that
the new systems biology will never completely succeed in the real world
of whole animal physiology, it is necessary to choose tractable
experimental systems to address the simplest questions first. For
example, predicting the behavior of lymphocytes after exposure to a
pathogen or chemical agent on the basis of genomic information might be
a good starting point. It is noteworthy that the early “bottom-up”
approach to modeling growth and lactation led by Lee Baldwin (Baldwin
and Donovan, 1998) and others had reasonable success given the amount
of biochemical information available during the 1970s and 1980s. It is
imperative that the vast knowledge accumulated by the early systems
modelers be integrated with the flood of data arising from genomics. As
conventional modeling approaches merge with new genomic technologies,
rare opportunities for discovery, data integration and application will
arise. With the limitation on computational power largely dissipated,
unique scientific opportunities presented by working with cattle can be
revisited, for example, using cows for functional genomic studies of
lactation.

Indeed, as more mammalian genomes are characterized, the rationale for
an experimental animal model will be made on the basis of genomic
similarity and disease phenotype rather than merely cost, convenience
or purely phenotypic correlation. The relatively common example of
curing cancer in mice but not humans with the same biotherapeutic
agents should serve as a useful guidepost for deliberations within the
agencies funding biomedical research. Thus, as comparative functional
genomics and systems biology mature there may be a renaissance in the
use of cattle and other livestock as biomedical models for research. In
particular, gut immunobiology, lactation, and reproductive disorders
(e.g., cystic ovaries) in cattle would seem extremely attractive
research opportunities at this time.

Finally, it is important to acknowledge that genomics may play an
important role in securing the supply of meat and dairy products in the
United States and beyond. One important area is the traceability of
milk and meat products and a second is bioterrorism. The threat of
bioterrorism is very real and cannot be ignored. Deliberate exposure to
a number of infectious agents, such as Foot and Mouth Disease Virus and
the prion agent of Bovine Spongiform Encephalopathy, can have
devastating economic and social consequences, as recently witnessed in
the United Kingdom. Genomics-based methods for monitoring early
exposure to infectious agents, particularly those that have acute
onset, should be targeted for research funding. Clearly, the key
element in containment of infectious agents is early detection. Gene
and protein expression profiles appear to be particularly promising as
sentinels for exposure to animal and human pathogens.

Conclusion 

At no time in recent years has dairy and beef cattle research been so
ripe for major discoveries. All of the traditional disciplines within
the animal sciences and veterinary medicine will be directly affected
by the availability of the cattle genome sequence and the application
of transcription profiling and allied technologies for functional
genomics. When the genome sequencing effort is completed, potential
resources for functional genomics will reach their zenith. The
discoveries made during the next decade with microarray technology,
RNAi, proteomics and other genome-enabled methods can completely
transform our understanding of cattle physiology and will undoubtedly
lead to improved nutritional and reproductive efficiencies, enhanced
animal breeding strategies and novel immunotherapeutics.
Abstract. Our ongoing goal is to improve and update the comparative
genome organization not only between cattle and man but also among the
most detailed mammalian genomes, i.e. cattle, mouse, rat and pig. In
this work, we localized 195 genes in cattle and checked all
human/bovine non-concordant localizations found in the literature.
Next, we compiled all the genes mapped in cattle, goat, sheep and pig
(2,166) for which the human ortholog with its chromosomal position is
known, added corresponding data in mouse and rat, and ordered the genes
relative to the human genome sequence. We estimate that our compilation
provides bovine mapping information for about 89% of the human
autosomes. Thus, a near complete, overall and detailed picture of the
number, distribution and extent of bovine conserved syntenies
(regardless of gene order) on human R-banded autosomes is proposed as
well as a comparison with mouse, rat and pig genomes.

Copyright©2003 S. Karger AG, Basel 


A major goal of livestock genomics is to map and identify genes
involved in economically important traits and disease susceptibility
and resistance. This requires powerful genome resources, i.e. BAC and
YAC libraries, radiation hybrid cell panels, detailed genomic maps and
comparative maps. In recent years, human (International Human Genome
Sequencing Consortium, 2001; Venter et al., 2001), mouse (Mouse Genome
Sequencing Consortium, 2002) and rat (Rat Genome Project, 2003) genomes
have been sequenced to near completion. This has provided an essential
source of data for understanding genome organization and evolution,
comparing genomes, identifying unknown genes, and analyzing gene
expression and regulation. The cow is one of the economically most
important species and, over the past years, considerable work has been
done to create detailed bovine gene maps, but its genome has yet to be
sequenced. In this work, our aims were (1) to increase the number of
genes mapped in cattle (195 new localizations), (2) to solve
human/bovine non-concordant localizations currently found in the
literature and (3) to update the organization of the cattle genome in
relation to human, mouse, rat and pig.
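
As a minimal sketch of what grouping genes into conserved syntenies
"regardless of gene order" means at the chromosome level, the Python
snippet below groups a handful of genes by their human (HSA) and bovine
(BTA) chromosome pair. The chromosome assignments are taken from Tables
3 and 4 of this paper; the function name and the data layout are
illustrative assumptions and do not represent the software used by the
authors.

```python
from collections import defaultdict

# (gene, human chromosome, bovine chromosome), chromosome level only.
# Assignments as reported in Tables 3 and 4 of this paper.
ASSIGNMENTS = [
    ("CDC20", "1", "3"),
    ("IVL", "1", "3"),
    ("IL6R", "1", "3"),
    ("GUK1", "1", "7"),
    ("SAG", "2", "3"),
    ("COP9", "2", "3"),
    ("HSPA4", "5", "7"),
    ("VAV1", "19", "7"),
    ("CCT8", "21", "1"),
]


def conserved_syntenies(assignments):
    """Group genes by (HSA, BTA) chromosome pair, ignoring gene order."""
    groups = defaultdict(list)
    for gene, hsa, bta in assignments:
        groups[(hsa, bta)].append(gene)
    return dict(groups)


if __name__ == "__main__":
    for (hsa, bta), genes in sorted(conserved_syntenies(ASSIGNMENTS).items()):
        print(f"HSA{hsa}/BTA{bta}: {', '.join(genes)}")
```

Each key of the returned dictionary is one human/bovine chromosome
pair; a pair supported by several independent genes, such as HSA1/BTA3
here, is more convincing than one resting on a single assignment.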


Materials and methods 

Primer pairs and probes for FISH mapping 

Bovine YAC (Libert et al., 1993) and BAC clones (Eggen et al., 2001)
were used to FISH-map 37 genes (see Table 3). They were obtained by PCR
screening with three-dimensional pooling schemes as described in the
respective publications using primer pairs listed in Tables 1 and 2. In
addition, 151 caprine BAC clones isolated by Schibler et al. (1998) and
corresponding to 151 genes (Table 3) were FISH mapped.
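
The exact pooling schemes are given in the cited library publications
and are not reproduced here. As a generic illustration only, the sketch
below assumes a library arrayed in microtitre plates and pooled along
three dimensions (plate, row, column); a clone address is a candidate
when its plate pool, row pool and column pool are all PCR-positive. The
function name and the example pool labels are hypothetical.

```python
from itertools import product


def candidate_clones(positive_plates, positive_rows, positive_columns):
    """Return every (plate, row, column) address consistent with the
    PCR-positive pools of a three-dimensional pooling scheme."""
    return list(
        product(
            sorted(positive_plates),
            sorted(positive_rows),
            sorted(positive_columns),
        )
    )


if __name__ == "__main__":
    # One positive pool per dimension: the address is unambiguous.
    print(candidate_clones({12}, {"C"}, {7}))
    # Two positives per dimension: 2 x 2 x 2 = 8 candidate addresses,
    # which must be resolved by PCR on the individual clones.
    print(len(candidate_clones({12, 30}, {"C", "F"}, {7, 9})))
```

When several clones are positive for the same gene, the pools alone no
longer identify a single address, which is why a confirmatory PCR on
individual candidate clones is normally needed.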

Chromosomal assignments on the INRA bovine × hamster somatic cell hybrid panel (SCH)

Chromosomal assignments for seven of the genes (see Table 3) were
obtained by PCR analysis of the INRA bovine × hamster hybrid panel as
described by Laurent et al. (2000).
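
The scoring procedure itself is described in Laurent et al. (2000). The
sketch below only illustrates the general principle of a somatic cell
hybrid assignment, under the assumption that a gene is assigned to the
chromosome whose retention pattern across the hybrid lines agrees best
with the PCR result pattern of the gene. The panel data, the number of
hybrids and the function names are invented for illustration.

```python
def concordance(marker_pattern, chromosome_pattern):
    """Fraction of hybrid lines in which the PCR result (1/0) agrees
    with the presence/absence (1/0) of the chromosome."""
    matches = sum(m == c for m, c in zip(marker_pattern, chromosome_pattern))
    return matches / len(marker_pattern)


def assign_chromosome(marker_pattern, panel):
    """Return the chromosome with the highest concordance and its score."""
    scores = {
        chromosome: concordance(marker_pattern, pattern)
        for chromosome, pattern in panel.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical retention patterns for eight hybrid lines (not real data).
PANEL = {
    "BTA3": [1, 0, 1, 1, 0, 0, 1, 0],
    "BTA7": [0, 1, 1, 0, 0, 1, 0, 1],
    "BTA17": [1, 1, 0, 0, 1, 0, 0, 1],
}

if __name__ == "__main__":
    pcr_results = [0, 1, 1, 0, 0, 1, 0, 1]  # hypothetical PCR pattern
    print(assign_chromosome(pcr_results, PANEL))
```

In practice a threshold on the concordance score is usually applied
before an assignment is accepted; no threshold is assumed here.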

Fluorescent in situ hybridisation (FISH) 

Chromosome preparations, DNA labelling, FISH, and R-banding are 
described in Hayes et al. (1992, 2000). Chromosome and band numbering 
followed ISCNDB (2000). 
Table 1. List of genes localized with bovine YAC clones and corresponding primers. Thirty-five YAC clones were selected for
22 genes from human chromosomes 21 and 3 using specific primer pairs defined in conserved exons of the orthologous genes in
man, cattle, sheep, pig, mouse and/or rat. Gene identity controls for bovine PCR products obtained with human or ovine primers
revealed 88-100% sequence similarity. Note: 35 independent YAC clones were isolated, but since 5 clones each contained two
genes, the total number in Table 1 amounts to 40.


Gene (a) | Number of YAC clones (b) isolated and hybridized/gene | Origin | PCR product size (bp) | Forward primer | Reverse primer
APP | 3 | bovine | 99 | GCAGAAGACGTGGGTTCC | CTTCAGCATCACCAAAGTTGA
IFNAR1 | 1 | bovine | 290 | AGAAGTTTTCTGCGTCCTTTGCC | TGATGGTGGTATTCAGGTTCTTC
TIAM1 | 1 | human | 147 | GAGCCGAAAGGATTTCCTAAAG | GGCCTGGCCCCCTTCAGTG
ATP5O | 2 | bovine | | AACCTGATCAATTTGCTTGCTGA | GTAACTGTGCATGGTACTTCTCCA
GRIK1 | 3 (1x nc) | human | 130 | GTTTTTCACCCTAATCATCAT | TGCCCCATATTCTATCTTG
KCNE1 | 3 | human | 79 | TGGGATTCTTCGGCTTCTTCA | GGGTCGTTCGAGTGCTCCAG
SON | 3 | human | 104 | GCTGTAACCAACTTATCTAA | GAAATATTCTATCAGACCCTA
SLC5A3 | 2 (1x nc) | bovine | | AGGCTCTGCTCATGATCGTT | GGTTCCGCAACATTTTCAGT
SOD1 | 2 (2x nc) | bovine | 280 | GTTTGGCCTGTGGTGTAATTGGAA | GGCCAAAATACAGAGATGAATGAA
PRSS7 | 2 (1x nc) | bovine | 97 | ATTCAGCAAATGATAGATGAT | CCACTGGTCAAAGAGAAG
POU1F1 | 1 | bovine | 150 | GTAGTTTAACCCCTTGTCTTTAT | TTGGCTCTTCCACCAATTTACTT
GAP43 | 1 (nc) | human | 203 | GATCCCAAGTCAAACAGTGTG | TCAGATGAACGGAACATTGC
HES1 | 2 (2x nc) | human | 153 | ATTGGCTGAAAGTTACTGTGG | GAGGTAGACGGGGGATTC
KNG | 1 | bovine | 286 | CCTACTTCAGTTTTCTGAT | GAGATTACTAGCCCATTTTGGAA
SIAT1 | 1 | bovine | | GCGTATTTTCCTGCTCAGAACAGC | CCGGGAGGACTTCAGAGATCCTG
ERG | 2 | human | 295 | CACCAACGGGGAGTTCAA | CGCCACAAAGTTCATCTTCTG
MX2 | 2 | ovine | 94 | GGCCCTGCATTGACCTCATC | GAGCTCTGGTCCCCGATAACG
MX1 | 2 | ovine | 94 | GGCCCTGCATTGACCTCATC | GAGCTCTGGTCCCCGATAACG
RUNX1 | 2 (1x nc) | human | 135 | CCTGTCGCCGTCTGGTAGGAG | GCTCATCTTGCCTGGGCTCAG
TFF3 | 1 | human | 72 | GCGGCTACCCCCATGTCAC | ACACCAAGGCACTCCAGGGAT
CBR1 | 2 | human | 125 | AGCAGAGGAAAGGGGACAAGA | GGCCAAGTACACAGGGGTCTC
HMGN1 | 1 (nc) | human | 160 | TCTTGTACAATCCAGAGGAAT | AATAAACAACCAGCAAATGAT

(a) Gene symbols are from the HUGO Nomenclature database (http://www.gene.ucl.ac.uk/nomenclature/).
(b) nc = non-chimeric bovine YAC clones.
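
The clone numbers in Table 1 can be reconciled with the legend by a
short tally. The sketch below is only a consistency check on the
published figures, written in Python with variable names chosen for
illustration: summing the per-gene clone counts gives the 40 entries
mentioned in the legend, and subtracting the 5 clones that each carry
two genes gives the 35 independent clones.

```python
# Number of YAC clones isolated and hybridized per gene (Table 1).
YAC_CLONES_PER_GENE = {
    "APP": 3, "IFNAR1": 1, "TIAM1": 1, "ATP5O": 2, "GRIK1": 3, "KCNE1": 3,
    "SON": 3, "SLC5A3": 2, "SOD1": 2, "PRSS7": 2, "POU1F1": 1, "GAP43": 1,
    "HES1": 2, "KNG": 1, "SIAT1": 1, "ERG": 2, "MX2": 2, "MX1": 2,
    "RUNX1": 2, "TFF3": 1, "CBR1": 2, "HMGN1": 1,
}

# Stated in the Table 1 legend: 5 clones each contained two genes.
CLONES_WITH_TWO_GENES = 5

genes = len(YAC_CLONES_PER_GENE)
clone_entries = sum(YAC_CLONES_PER_GENE.values())
independent_clones = clone_entries - CLONES_WITH_TWO_GENES

if __name__ == "__main__":
    print(genes, clone_entries, independent_clones)
```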


Table 2. List of genes localized by FISH with bovine BAC clones (F) or by somatic cell hybrid mapping (SCH) and corresponding primers. Bovine BAC 
clones for 15 genes were isolated with primers defined from bovine or human sequences. Regions sharing a high sequence similarity with the orthologous 
human gene were determined using the Iccare program (T. Faraut; http://genopole.toulouse.inra.fr/Iccare). Gene symbols are from the HUGO Nomenclature 
database except when no official nomenclature is available to date as indicated by an asterisk. 


Gene | Origin | GenBank accession no. | PCR product size (bp) | Forward primer | Reverse primer | Mapping procedure
CCT8 | bovine | AF136609 | 150 | GTGCCCTGGACTTGAACAGTA | CCAACGTTTTTATTTCCTTCTTG | F
COL6A1 | human | X99136 | 282 | TCTCCTCCCCGGCTGACATCACC | TCGTTGACGTCGGTGGCGTCGTTG | F
SMP1* | human | NM014313 | 102 | TTACCTTTAGCAACATAACCTC | TTGAAGGTGCTCGCATTTGGCT | F
COP9* | bovine | AW464695 | 165 | AGCAGCAGTTAGCCAGACTCA | TCAGTAAAGTATGCCGAGGTGA | F
COL6A3 | bovine | AW356722 | 200 | TCAATACCTACCCCAGCAAGA | GACCGCATCTAGGGACTTACC | F
CDC20 | bovine | BF193752 | 201 | ACATCCACCACCATGACGTT | CTTGATGCTGGGTGAAGGTCT | F
IVL | bovine | AV618666 | 106 | CACACTCTGCCAGTGATTCAG | TGAAGTTGGCTTGCTTCACTT | F
SSBP1 | bovine | BE722675 | 100 | AGTTGCTCGGTCGAGTAGGTC | GATCGCCACATCTCATTTGTT | F
HADHSC | bovine | BM481145 | 141 | ATGAGTTTGTGGCGAAGACC | ACTTGTCCAGCCTCTTGAACA | F
VAV1 | bovine | BE485884 | 100 | CTACCAACAGAACTCTCTGAAGGA | TGATGGTTTGTTGATGGCTCT | F
HSPA4 | bovine | BI537416 | 196 | TGACGAGTATCTGCAGCCCTA | TCAATGTCCATTTCAGGAAGC | F
RAB18 | bovine | BM967938 | 168 | ACAAGGTCTGGTTTTGGCTCT | CATCCTAGGAAAGCTGTGGAA | F
PSMA7 | bovine | BF045709 | 112 | CCTCACTGCTGATGCAAGAAT | GGCGATATAACGGGTGATGTA | F
GNAO1 | bovine | BM286134 | 182 | CGACACCAACAACATCCAGTT | TGTCAAGGGCAGAGAGTGG | F
SERPINB1 | bovine | AV606014 | 102 | TCACCATGGAGCAGCTGAG | AGGGGGAGATGAAGATGTTTC | F
TRIM17 | bovine | BE668983 | 176 | TCTTCCTGGATTTTGAAGCTG | CTATCCCTTCACCCACAGAGTC | SCH
ELF2 | bovine | AW463485 | 104 | GCTGAGCCTTAACTTCCGAGT | CTGGCCTTTTGCTATAACACG | SCH
IL6R | bovine | BE685522 | 121 | CAGGAGCCCTGCCAGTATT | ACTTGTTCCCGGCACTGTT | SCH
RAB33B | bovine | AV594098 | 199 | CAATGACATGCTGGTGCTAAA | GTGTCACAGCGACAAACTTGA | SCH
OSBPL8 | bovine | BE665912 | 174 | CGGGTAACTCGAGCCATAAAT | TGCAAACTTGTAATGCCATTCT | SCH
ARF1 | bovine | AV601289 | 147 | GACCTCCCTAATGCCATGAAC | GAGCTGATTGGACAGCCAGT | SCH
SAG | bovine | J02955 | 189 | GTACGTGTCTCTGACGTGTGC | AGCAGGAAGGGGTAGGTGTT | SCH
ACK1 | bovine | BG690194 | 194 | CTGTCTCCACAAGGCTCCAG | ACGATGGGCAAGATGCAG | F (unexpected)
NDUFC1 | bovine | X63214 | 106 | CCGTAAAGTGATTATAGCAGTTCC | AAAACACAAATGCTGACTTGACA | F (unexpected)
NDUFV2 | bovine | M22539 | 157 | GCAGAAATTTTACAAGTACCTCCA | TCTGAATGGCTTCCAGTATGC | F (unexpected)
Table 3. List of new gene localizations on bovine chromosomes obtained by FISH and somatic cell hybrid mapping in this study. Normal characters: FISH
with caprine BACs; bold italic: with bovine YACs; bold normal: with bovine BACs; shaded background: somatic cell hybrid mapping.


Gene | Localization | Gene | Localization | Gene | Localization | Gene | Localization | Gene | Localization
KRTAP8* | 1q12.2 | GDF8 | 2q12.2 | CSN2 | 6q32 | EDNRB | 12q22 | C9 | 20q17
APP | 1q12.2 | EN1 | 2q33 | CSN1S1 | 6q32 | IL2RA | 13q13med | SLC6A3 | 20q24
SOD1 | 1q12.2 | GLI2 | 2q33 | CSN1S2 | 6q32 | ITGB1 | 13q13dist | UBE3A | 21q12
IFNAR1 | 1q12.2 | SLC11A1 | 2q43 | PDE6B | 6q36 | VIM | 13q14 | MEF2A | 21q13
TIAM1 | 1q12.2 | PAX3 | 2q43 | VAV1 | 7q15prox | RAB18 | 13q15prox | CHRNA7 | 21q17dist
ATP5O | 1q12.2 | AK2 | 2q45prox | GM2A | 7q21 | PSMA7 | 13q22prox | GRP58 | 21q23-q24
GRIK1 | 1q12.2 | SMP1* | 2q45prox | HSPA4 | 7q22.1 | ASIP | 13q22dist | SERPINA3 | 21q24prox
KCNE1 | 1q12.2 | CRP | 3q13 | RASA1 | 7q25.2 | ADA | 13q24prox | CHGA | 21q24prox
SON | 1q12.2 | S100A6 | 3q21 | CAST | 7q27 | CYP11B1 | 14q13 | SERPINA1 | 21q24prox
SLC5A3 | 1q12.2 | THH | 3q21 | TRIM17 | 7 | TG | 14q15 | MITF | 22q22
CCT8 | 1q12.2 | IVL | 3q21 | ARF1 | 7 | MYC | 14q15 | PBXP1 | 22q22
PRSS7 | 1q14 | NGFB | 3q23 | GALT | 8q13 | CRH | 14q19 | GPX1 | 22q24prox
POU1F1 | 1q21dist | NRAS | 3q23 | VLDLR | 8q17 | MMP1 | 15q12prox | HRH1 | 22q24dist
GAP43 | 1q24 | TSHB | 3q23 | SFTPC | 8q21dist | FDX1 | 15q21prox | GSTA1 | 23q22prox
NDUFB4 | 1q31prox | UOX | 3q31-q32.1 | CTSL | 8q25 | APOA1 | 15q21 | BF | 23q22
CASR | 1q31 | ACADM | 3q32.2 | GSN | 8q28 | HBB | 15q25prox | OLADRB | 23q22
ZNF148 | 1q31 | CDC20 | 3q35 | COL9A1 | 9q12.2 | FSHB | 15q25dist-q26 | EDN1 | 23q24prox
UMPS | 1q31 | COP9* | 3q37 | AMD1 | 9q16prox | PAX6 | 15q27 | F13A1 | 23q24dist
HES1 | 1q31dist | COL6A3 | 3q37 | CGA | 9q22 | WT1 | 15q27 | SERPINB1 | 23q24dist
CRYGS | 1q33 | IL6R | 3 | HMGCR | 10q12 | PGD | 16q21prox | CYB5 | 24q12
AHSG | 1q33 | SAG | 3 | MYH6 | 10q15-q21 | NPPA | 16q21 | DSG2 | 24q21-q22
KNG | 1q33 | HGF | 4q15dist-q21 | MYH7 | 10q15-q21 | LAMC2 | 16q23 | ADCYAP1 | 24q23
SIAT1 | 1q33 | LAMB1 | 4q22 | HEXA | 10q15dist | IL2 | 17q22dist | HBA1 | 25q12prox
CP | 1q41dist | NPY | 4q25-q26 | NP | 10q21 | NOS1 | 17q25 | EPO | 25q22
AGTR1B | 1q42 | IGFBP3 | 4q26 | THBS1 | 10q22 | COMT | 17q26 | ACTA2 | 26q13
GYG | 1q42 | OPN1SW | 4q32 | MGAT2 | 10q24 | ELF2 | 17 | CYP17 | 26q21
RBP1 | 1q43 | CLCN1 | 4q34 | TPM1 | 10q26 | RAB33B | 17 | DNTT | 26q21
NCK1 | 1q43 | SSBP1 | 4q34dist | CYP19 | 10q31 | DPEP1 | 18q13 | PAX2 | 26q21
TFDP2 | 1q43prox | KRTB@ | 5q21 | SORD | 10q32 | MC1R | 18q13 | OAT | 26q23prox
TF | 1q43dist | AVPR1A | 5q23 | SPTB | 10q34prox | MT2A | 18q15 | DEFB1 | 27q13
MX1 | 1q45prox | IFNG | 5q23 | TGFB3 | 10q34dist | GNAO1 | 18q15 | F11 | 27q15
MX2 | 1q45prox | IGF1 | 5q31prox | TGM1 | 10q34 | RYR1 | 18q24prox | ANK1 | 27q19
TFF3 | 1q45prox | FGF6 | 5q35prox | TGFA | 11q14 | PTGIR | 18q24dist | PLAT | 27q19
HMGN1 | 1q45prox | OSBPL8 | 5 | IL1B | 11q22 | LHB | 18q24dist | RBP3 | 28q18-q19
CRYAA | 1q45med | UGT8 | 6q13 | CAD | 11q24dist | ACACA | 19q13 | AGT | 28q19
COL6A1 | 1q45med | TXK | 6q14 | POMC | 11q24dist | MYH2 | 19q15-q16 | TYR | 29q13
CBR1 | 1q45med | MTP | 6q15 | ASS | 11q28prox | GAS | 19q17 | OPCML | 29q22
RUNX1 | 1q45med | HADHSC | 6q15prox | BRCA2 | 12q15 | PNMT | 19q17 | COX8 | 29q24prox
ERG | 1q45dist | GNRHR | 6q32prox | SGCG | 12q15dist | MAPT | 19q22prox | LDHA | 29q24prox
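
The localizations in Table 3 follow a compact notation: chromosome
number, arm, band, an optional prox/med/dist qualifier and an optional
second band for intervals. For readers who wish to process the table,
the sketch below shows one possible way to split such strings in
Python. The regular expression, the field names and the reading of
prox/med/dist as the proximal, medial and distal parts of a band are
assumptions made for illustration; only the q arm is handled, since all
localizations listed are on q arms or are chromosome-only (SCH)
assignments.

```python
import re

# chromosome, optional "q" + band, optional qualifier, optional end band
LOCALIZATION = re.compile(
    r"^(?P<chromosome>\d+)"
    r"(?:q(?P<band>\d+(?:\.\d+)?)"
    r"(?P<qualifier>prox|med|dist)?"
    r"(?:-q(?P<end_band>\d+(?:\.\d+)?))?)?$"
)


def parse_localization(text):
    """Split a Table 3 localization such as '15q25dist-q26' into fields.

    Returns a dict with keys chromosome (int), band, qualifier and
    end_band (strings or None). Raises ValueError on unknown formats.
    """
    match = LOCALIZATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized localization: {text!r}")
    fields = match.groupdict()
    fields["chromosome"] = int(fields["chromosome"])
    return fields


if __name__ == "__main__":
    for example in ("1q12.2", "13q22prox", "15q25dist-q26", "7"):
        print(example, parse_localization(example))
```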


Results and discussion 

Mapping of 195 genes on bovine chromosomes by FISH and SCH analysis

The gene mapping results (188 by FISH and seven by SCH analysis) are
summarized in Table 3. New or refined gene localisations were obtained
for all the bovine autosomes although gene density varies among
chromosomes. Concerning the distribution of the genes in relation to
chromosomal bands, most of those that we have mapped by FISH are
located on R-positive bands, confirming the general trend that
R-positive bands are richer in genes than R-negative bands.
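
The totals quoted above follow directly from the numbers given in
Materials and methods and in Tables 1 and 2. The short Python tally
below is only a bookkeeping sketch with illustrative variable names; it
shows how the 188 FISH localizations and the overall figure of 195
genes are obtained.

```python
bovine_yac_genes = 22     # genes mapped with bovine YAC clones (Table 1)
bovine_bac_genes = 15     # genes mapped with bovine BAC clones (Table 2)
caprine_bac_genes = 151   # genes mapped with caprine BAC clones
sch_genes = 7             # genes assigned on the somatic cell hybrid panel

bovine_clone_genes = bovine_yac_genes + bovine_bac_genes  # FISH, bovine clones
fish_genes = bovine_clone_genes + caprine_bac_genes       # all FISH localizations
total_genes = fish_genes + sch_genes                      # FISH + SCH

if __name__ == "__main__":
    print(bovine_clone_genes, fish_genes, total_genes)
```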

Resolving non-concordant chromosome assignments between man and cattle

At the beginning of our work, we had listed 37 discrepancies (Table 4)
between the bovine and human gene maps based on existing comparative
data. In order to establish the most accurate bovine/human genome
comparison, these discrepancies were checked to determine whether they
were due to true novel synteny groups, to mapping errors in the human
or bovine maps or to false orthologous gene pairs. Based on our results
(details in Table 4), these discrepancies were sorted into seven
classes: (a) two genes (GUK1 and SAG) belonging to two new synteny
groups, i.e. between HSA1/BTA7 and HSA2/BTA3; (b) two genes (SKI and
PDE1A) originally mismapped on the human map and repositioned following
the sequencing of the human genome; (c) five genes (CDC20, FABP3, IVL,
COP9, CCT8) mismapped on the bovine map and remapped at expected
localizations in this work; (d) eight genes (IL6R, HSPA4, SERPINB1,
SSBP1, VAV1, HADHSC, RAB18, PSMA7) originally incorrectly identified
and remapped either by FISH or SCH analysis at expected localizations
in this work; (e) ten genes (SMP1, GAPDL, GLUL, ACTR2, PABPL1,
Table 4. BTA/HSA inconsistent localizations examined in this study. Note: localization of underlined genes (see multispecies comparative table) confirms
or supports expected localization of neighbouring gene with grey background.

Gene (a) | HSA map | MMU map | RNO map | BTA map (published) | BTA map (expected) | Reference and mapping mode (b) | BTA localization (this work) when available and comments
GUK1 | 1q42.13 | 11B2 | (10q22) | 7 | 28/16 | Band et al. (2000) RH (new synteny group confirmed) | No BAC but GUK1 on BTA7 defines a new synteny group confirmed by the localization of TRIM17 and ARF1 on BTA7 (this work, Table 3)
SAG | 2q37.1 | 1C5 | 9q34 | 3 | 2 | Band et al. (2000) RH (new synteny group confirmed) | SAG assigned to BTA3 by SCH and defines a new synteny group confirmed by the localization of COP9* and COL6A3 on BTA3 (this work, Table 3)
SKI | 1p36.32 | 4E2 | (5q36) | 16q21 | 16/33 | Sonstegard et al. (2000) F (previously SKI on HSA1q22-q24) | If SKI on HSA1p36.22 (cf UCSC database) OK with BTA16
PDE1A | 2q32.1 | 2D | 3q23 | 2 | 2/6 | Barendse et al. (1997) L (previously PDE1A on HSA4) | If PDE1A on HSA2 (cf UCSC database) OK with BTA2
CDC20 | 1p34.2 | 4D1 | | 6 | 3 | Band et al. (2000) RH (mapping error) | CDC20 FISH-mapped to BTA3q35 (this work, Table 3)
FABP3 | 1p35.2 | 4D2.3 | 5q36 | 6 | 2 | Barendse et al. (1997) L (mapping error) | FABP3 FISH-mapped to BTA2q45 (pers communication)
IVL | 1q21.3 | 3F2 | 2q34 | 1q41-46 | 3 | Schmutz et al. (1998) radioactive ISH (mapping error) | IVL FISH-mapped to BTA3q21 (this work, Table 3)
COP9* | 2q37.3 | (1C5) | (9q34) | 25 | 2/3 | Band et al. (2000) RH (mapping error) | COP9 FISH-mapped to BTA3q37 (this work, Table 3)
CCT8 | 21q21.3 | 16C3.3 | (11q22) | 7 | 1 | Band et al. (2000) RH (mapping error) | CCT8 FISH-mapped to BTA1q12.2 (this work, Table 3)
IL6R | 1q22 | 3F2 | 2q34 | 19 | 3 | Barendse et al. (1999) L (no bovine sequence so gene identification?) | IL6R assigned by SCH to BTA3 (this work, Table 3)
HSPA4 | 5q31.1 | 11B1.3 | 10q22 | 3q13 | 7 | Gallagher et al. (1993) FISH (no bovine sequence but the gene mapped in this paper is probably HSPA6) | HSPA4 FISH-mapped to BTA7q22.1 (this work, Table 3)
SERPINB1 | 6p25.2 | 13A4 | 17p12 | 21 | 23 | Georges et al. (1990) L (no bovine sequence so gene identification?) | SERPINB1 FISH-mapped to BTA23q24 (this work, Table 3)
SSBP1 | 7q34 | 6B2 | 4q23 | 2 | 4 | Barendse et al. (1997) L (no bovine sequence so gene identification?) | SSBP1 FISH-mapped to BTA4q34 (this work, Table 3)
VAV1 | 19p13.3 | 17E1.1 | 9q11-q12 | 28 | 7 | Barendse et al. (1999) L (no bovine sequence so gene identification?) | VAV1 FISH-mapped to BTA7q15 (this work, Table 3)
HADHSC | 4q25 | 3H1 | 2q42 | 26 | 6 | Band et al. (2000) RH (Acc no AW289352 gives 93% sequence similarity on 87 bp so gene identification?) | HADHSC FISH-mapped to BTA6q15 (this work, Table 3)
RAB18 | 10p12.1 | 18A1 | (17q12.1) | 9 | 13 | Karall-Albrecht et al. (2000) SCH (Acc no AI461405 gives 83% sequence similarity on 184 bp so gene identification?) | RAB18 FISH-mapped to BTA13q15 (this work, Table 3)
PSMA7 | 20q13.3 | 2H4 | 3q43 | 4 | 13 | Karall-Albrecht et al. (2000) SCH (Acc no AI461430 gives 93% sequence similarity on 99 bp so gene identification?) | PSMA7 FISH-mapped to BTA13q22 (this work, Table 3)
SMP1* | 1p36.11 | 4D3 | 5q36 | 14 | 2 | Band et al. (2000) RH (Acc no U89254 in fact = RGS20) | SMP1* FISH-mapped to BTA2q45 (this work, Table 3)
GAPDL | 2q11.2 | | | 11 | | Barendse et al. (1999) L (gene identification?) | No GAPDL in human database, in fact probably GAPDL3 on HSA2q11.2 OK with BTA11
GLUL | 1q25.3 | 1G3 | (13q21) | 10 | 16 | Masabanda et al. (1997) GLUL and GLULP FISH-mapped with sequences Acc no Y10347 (BTA10) and Y10348 (BTA16), respectively | In fact, Y10348 = GLUL (not GLULP) on HSA1q25.3 OK with BTA16q21
ACTR2 | 2p14 | (11A3.2) | (14q22) | 3 | 2/11 | Band et al. (2000) RH (Acc no U83023 ≠ ACTR2) | In fact, U83023 = GTF2B on HSA1p22.2 OK with BTA3
PABPL1 | 3q25.2 | | | 14 | 16 | Band et al. (2000) RH (Acc no U83076 ≠ PABPL1) | In fact, U83076 = PABPC1 on HSA8q22.3 OK with BTA14
ADCY2 | 5p15.31 | 13C1 | 17p14 | 15 | 20 | Amarante et al. (1999) FISH (in fact, mapped NCAM1, ref (c)) | NCAM1 on HSA11q23.1 OK with BTA15
CACNB3 | 12q13 | 15F2 | 7q35 | 14 | 5 | Band et al. (2000) RH (Acc no AW266991 ≠ CACNB3) | In fact, AW266991 = MAF1 on HSA8q24.3 OK with BTA14
PLCG2 | 16q23.3 | 8E1 | 19q12 | 13 | 18 | Schlapfer et al. (1997) SCH (Acc no Y00301 ≠ PLCG2) | In fact, Y00301 = PLCG1 on HSA20q12 OK with BTA13
CSH1 | 17q23.3 | 11E1 | 10q32.1 | 23 | 19 | Dietz et al. (1992) L (Acc no J02840 ≠ CSH1) | In fact, J02840 = PRL on HSA6p22.3 OK with BTA23
MSF | 17q23.2 | 11E2 | (10q26) | 16 | 19 | Band et al. (2000) RH (Acc no AF056218 ≠ MSF) | In fact, AF056218 = PRG4 on HSA1q31.1 OK with BTA16
NDUFC1 | 4q31.1 | 3D | (2) | 12 | 17 | Band et al. (2000) RH (member identification? of gene family) | NDUFC1 FISH-mapped again to BTA12q15 but expected localization on BTA17 supported by SCH mapping of ELF2 and RAB33B flanking NDUFC1 (this work, Table 3)
NDUFV2 | 18p11.2 | 17E1.2 | (9q38) | 5 | 24 | Barendse et al. (1997) L (member identification? of gene family) | NDUFV2 FISH-mapped again to BTA5q35 but expected localization on BTA24 supported by RH mapping of TWSG1 (AW267141 on BTA24, Band et al., 2000) just below NDUFV2
NAP1L1 | 12q21.2 | 10D1 | 7q21 | 13 | 5 | Ma et al. (1998) SCH (member identification? of gene family or mapping error?) | No BAC but expected localization on BTA5 supported by SCH mapping of OSBPL8 (BTA5, this work, Table 3) just below NAP1L1
UBE2I | 16p13.3 | 17A3.3 | 10q12 | 6q34 | 25 | Antoniou & Gallagher (2002) FISH (member identification? of gene family) | No BAC but expected localization on BTA25 supported by RH mapping of TPSB1 (BTA25, Band et al., 2000) just above UBE2I
GNAZ | 22q11.2 | 10B5.3 | 20p12 | 22 | 17 | Aleyasin and Barendse (1997) L (gene identification?) | (FISH of "GNAZ" on BTA18q15? in fact = GNAO1) see text
ACK1 | 3q29 | 16B2 | (11q11) | 6 | 1 | Band et al. (2000) RH (Acc no U96722 = ACK1) | ACK1 FISH-mapped to BTA15q23? open question
GDH | 1p36.22 | (4E1) | (5q36) | 5 | 16 | Monteagudo et al. (1992), Womack et al. (1986) SCH | No bovine sequence, no primers, no BAC, open question
NRGN | 11q24.2 | 9B | 8q21 | 10 | 15 | Band et al. (2000) RH (Acc no S78295 = NRGN) | Primers but no BAC, open question
FKSG17* | 8q22.3 | (4) | (11) | 8 | 14 | Goldammer et al. (2002) SCH + RH (gene identification?) | No BAC, open question
PTGDS | 9q34.3 | 2A.3 | 3p13 | 1 | 11 | Roncoleta et al. (2002) SCH & RH (Acc no AB004647 = PTGDS) | No BAC, open question

(a) Gene symbols are from the HUGO Nomenclature database except when no official nomenclature is available to date as indicated by an asterisk.
(b) SCH = somatic cell hybrid mapping, RH = radiation hybrid, ISH = in situ hybridisation, FISH = fluorescent ISH, L = linkage mapping.
(c) Reference: Gautier et al. (2002).


ADCY2, CACNB3, PLCG2, CSH1, MSF) originally misidentified
but for which bovine sequences were available, allowing
us to recover concordant orthologous human/bovine gene pairs
and localizations by BLAST analysis of the corresponding
accession numbers; (f) four genes (NDUFC1, NDUFV2, NAP1L1,
UBE2I) for which close neighbouring genes support their
expected localization and not that reported in the literature.
These genes are members of large gene families sharing
sequence similarities, which may impede identification of true
orthologous gene pairs; (g) six genes (GNAZ, ACK1, GDH,
NRGN, FKSG17, PTGDS) for which informative data could
not be obtained. For the GNAZ gene, we isolated two bovine
clones, both located on BTA18q15, but verification of the
primers used (Table 2) revealed that we had in fact selected
clones for GNAO1 on HSA16q13, in agreement with the
localization on BTA18. Furthermore, Pinton et al. (2000) have
shown that the caprine “GNAZ” BAC clone (BTA/CHI22)
maps to HSA3p21.3, in agreement with comparative mapping
data between BTA22 and HSA3, which suggests that this clone
may not contain the GNAZ gene. Finally, the ten genes of
classes (f) and (g) were not included in our analysis because
comparative mapping data were concordant among man,
mouse and rat but not with cattle, and we consider that a single
gene is not sufficient to support the existence of a novel
synteny group. In addition, the existence of paralogs and
pseudogenes sharing sequence similarities with a given gene can make
it difficult to establish the complete and true comparative
chromosome organization among species. Furthermore, as
discussed by Ozawa et al. (2000), for recently duplicated genes a
true orthologous gene may not exist in one of the compared
species.

Comparative mapping analysis 

All genes and ESTs mapped to date in cattle, sheep, goat and
pig (2,166) were compiled with corresponding data in man,
mouse and rat (see Multispecies Comparative Table accessible
online at http://locus.jouy.inra.fr/) and a detailed
human/cattle/mouse/rat comparative map (Fig. 1) was drawn to provide a
direct visualisation of the distribution of conserved
chromosomal segments among these four species. Several observations
can be derived from our data:

1. The 151 FISH localisations in cattle obtained with
caprine BAC clones were all (except one) concordant with those
reported in goat by Schibler et al. (1998). This further confirms
the high level of genome conservation between the two species
and more generally among the three main domestic bovidae,
cattle, goat and sheep. It also supports our decision to infer
bovine gene localisations from those mapped only in sheep or
goat. Only the COL9A1 gene was found on non-homoeologous
bovine and caprine chromosomes, i.e. BTA9q12.2 and
CHI14q11→q12. This finding confirms a translocation of
a segment equivalent to the centromeric region of CHI14 or
OAR9 (the ovine counterpart) to the centromeric region of
BTA9 during evolution from ancestral chromosomes (Vaiman
et al., 1996).

2. Correspondences between human and bovine autosomes
proposed on the basis of mapped genes agree nearly completely
with previous results obtained by heterologous chromosome
painting of bovine chromosomes with human individual paints
(Hayes, 1995; Solinas-Toldo et al., 1995; Chowdhary et al.,
1996). We show the existence of two new synteny groups, i.e.
(1) between HSA1q42.13 and BTA7q12 and (2) between
HSA2q37.1→q37.3 and BTA3q37. Interestingly, group (1) is
supported by comparative mapping data, i.e. HSA1/BTA7/
MMU11/RNO10, a combination also found on HSA19p, which
leads us to ask whether the telomeric tip of
HSA1q44/MMU11/RNO10 may also be conserved with
BTA7. At present, we have not succeeded in isolating BAC
clones for this chromosomal segment. Group (2) is located at
the telomeric end of both HSA2 and BTA3. These two new
synteny groups are particularly interesting since they cover
chromosome segments BTA7q12 and BTA3q37 not painted by
any human chromosome paint (Hayes, 1995) and situated
precisely in pericentromeric and telomeric regions with few known
mapped genes. This also suggests that determining the
correspondence of other such small segments (BTA3q12, 4q12→q13,
4q36, 8q12, 25q24, 28q12→q13) with human chromosomes
may reveal conserved syntenies unknown up to
now.

3. Based on Fig. 1, the coverage of bovine autosomes is
estimated at ~76% if putative segments and centromere
interruptions within the same segment are not considered and
~89% if they are included. Most of the putatively identified
conserved regions are associated with R-negative bands known
to be gene poor and the empty regions with pericentromeric,
telomeric and satellite regions known to be difficult to map
and characterize because they contain specific repetitive
sequences.

4. Figure 1 shows 84 bovine synteny segments conserved on
human autosomes, regardless of gene order and excluding
interruptions by human centromeres within the same bovine
chromosome. Eight of the 84 conserved segments are supported
by only one gene mapped in cattle (Multispecies Comparative
Table) but were retained because identical synteny
correspondences existed along the given human chromosome, suggesting
that minor reshuffling between these genome regions had
occurred during evolution. These 84 bovine/human conserved
segments represent a higher number than those in previous
reports, which ranged from 44 to 58 (Hayes, 1995; Solinas-Toldo et
al., 1995; Chowdhary et al., 1996; Iannuzzi et al., 1999;
Schibler et al., 1998; Band et al., 2000). This increase in number is
partly due to the mosaic organization of human/bovine
chromosome synteny, as for example between HSA11 and BTA15
and 29. Indeed, painting of bovine chromosomes with HSA11
revealed only two synteny groups (the entire BTA15 and
BTA29) while comparison of HSA11 with the bovine genome
displays a succession of ten segments conserved with BTA15
and 29 in our work (Fig. 1).

5. Based on the data compiled in the Multispecies
Comparative Table and on Fig. 1, it is clear that at this level of
resolution, the human genome is much more conserved with that of
cattle (and pig) than with that of mouse and rat. However, some
variation is observed among chromosomes. For example,
HSA17 and HSA20 are entirely conserved on single
chromosomes in the four other species: HSA17/BTA19/MMU11/
RNO10/SSC12 and HSA20/BTA13/MMU2/RNO3/SSC17. In
rare cases, the same synteny organisation relative to the human
genome is found, as on HSA8p (see Fig. 1), with an apparently
identical succession of conserved segments in cattle, mouse and rat,
i.e. BTA8/27, MMU8/14, RNO16/15, which suggests an
ancestral genome organisation maintained in these species. At the
other extreme, less than half of one of the smallest human
chromosomes, i.e. HSA22q11.21→q12.3, shares conserved
segments with two bovine or pig chromosomes and with seven
different mouse or rat chromosomes, revealing a complex pattern
relative to HSA22 (Fig. 1).

6. Seventy-eight synteny interruptions were found between
human and bovine autosomes (including centromere
interruptions). In general, their distribution does not coincide with
R-positive and R-negative band limits except for some instances,
e.g. along HSA1p, where regions 1p36.33→p36.31,
1p36.13→p35.1 and 1p34.3→p11.2 correspond respectively
to parts of BTA16, BTA2 and BTA3. Of the 78 synteny
interruptions, 21 are also found in mouse and rat (e.g. see
HSA12q23/q24.1, BTA5/17, MMU10C1/5F, RNO7q11/
12q16), suggesting that they probably occurred before
ruminants and rodents diverged.

Fig. 1. Distribution and extent of bovine, mouse and rat synteny
segments conserved on human autosomes. R-banded human autosomes were
taken from the human idiogram at approximately the 550-band level. Bovine
(B), mouse (M) and rat (R) conserved segments are aligned along each human
autosome and represented by color-coded bars with corresponding
chromosome numbers on the bars or next to the bars. The color codes for each of the
three species included in the figure, i.e. bovine (BTA), mouse (MMU) and rat
(RNO), are given at the bottom of the figure. Solid bars correspond to
segments confirmed by at least three loci mapped in two or more species (except
in a few cases, see text). Unfilled bars correspond to conserved segments
between human and bovine genomes deduced from extended comparisons
with mouse, rat and pig mapping data. In a few cases, no bar is drawn
because of lack of mapping information and comparative data among the
species. Arrows indicate the two new HSA/BTA synteny groups revealed in
this study. Pig was not included in Fig. 1 because the number of mapped loci
shared by pig and cattle is relatively small (~half of the total number of
porcine mapped loci) and because bi-directional human/porcine
chromosome painting has been reported by Goureau et al. (1996).

Conclusions

In this report, we present the first alignment of the bovine
genome with all the human autosomes and with the genomes of
mouse, rat and pig. It permits delineation of conserved synteny
regions among these species and the direct and rapid tracking
of a chromosomal region of interest of one species in the other
four species. To answer questions on the conservation of gene
order and on a more complete distribution of conserved
synteny segments between man and cattle, a more accurate
predictive tool is essential. For this, it will be necessary to identify,
map and order many more genes in bovine chromosomal
regions where correspondence with human chromosomes is
poorly supported. Extensive radiation hybrid mapping of genes
and the complete sequencing of the bovine and pig genomes
will provide definitive answers in this direction.

Abstract. In this study, we present a comprehensive 3,000-rad
radiation hybrid (RH) map of bovine chromosome 7
(BTA7) with 108 markers including 54 genes or ESTs. For 52 of
them, a human ortholog sequence was found either on HSA1
(one gene), HSA5 (31 genes) or HSA19 (19 genes and one
non-annotated sequence), confirming previously described
syntenies. Moreover, in order to refine boundaries of blocks of
conserved synteny, nine new genes were mapped to the bovine
genome on the basis of their localization on the human genome:
six on BTA7 and originating from HSA1 (TRIM17), HSA5
(MAN2A1, LMNB1, SIAT8D and FLJ11159) and HSA19
(VAV1), and the three others (AP3B1, APC and CCNG1) on
BTA10. The available draft of the human genome sequence
allowed us to present a detailed picture of the distribution of
conserved synteny segments between man and cattle. Finally,
the INRA bovine BAC library was screened for most of the
BTA7 markers considered in this study to provide anchors for
the bovine physical map.

Copyright©2003 S. Karger AG, Basel 


Construction of precise comparative maps has long been
considered a straightforward route to the positional cloning of
genes responsible for Economic Trait Loci in domestic animals
(Georges and Anderson, 1996; O’Brien et al., 1999). Indeed,
this approach profits from the extensive genome resources
available in other species such as man and mouse.

In cattle, the first large-scale comparative mapping studies
consisted of FISH mapping of heterologous chromosome
paints to characterize the distribution of conserved syntenies
among human and bovine chromosomes (Hayes, 1995;
Solinas-Toldo et al., 1995; Chowdhary et al., 1996). In particular,
for bovine chromosome 7 (BTA7), homologies with human
chromosomes 5 (HSA5) and 19 (HSA19) were previously
reported and confirmed somatic cell hybrid assignments of
several genes from HSA5 or HSA19 on BTA7 (respectively Zhang
et al., 1992; Gu et al., 1997). Nevertheless, no idea of the
internal structure of the identified conserved blocks was available at
this stage and thus these two approaches, although very
informative, needed to be refined.

Radiation hybrid (RH) mapping techniques, proposed
earlier by Goss and Harris (1976) and successfully
developed in man (Cox et al., 1990; Walter et al., 1994),
mouse (Schmitt et al., 1996) and also in cattle (Womack et al.,
1997; Williams et al., 2002), have proven to be efficient in
addressing this issue by the construction of detailed and
ordered gene maps (for instance in cattle: Band et al., 2000;
Gautier et al., 2002, 2003). In 1999, Gu et al. published the first
RH map of BTA7 containing 32 type II markers and seven
genes. A more detailed map with 37 type I markers was more
recently released by Band et al. (2000). Moreover, the authors
described a new synteny between the BTA7 centromeric region
and HSA1. In this study we attempted to develop a new BTA7
RH map using the recently developed European hamster x
bovine radiation hybrid panel (Williams et al., 2002). In order
to refine boundaries of blocks of conserved synteny, several
new genes were mapped to the bovine genome on the basis of
their localization on the human genome. In addition, the BAC
library was screened for all the markers to provide anchors for
the bovine physical map built in our laboratory (Schibler et al.,
2003).

Materials and methods 

Primer pairs 

Primer pairs for the 27 genes or ESTs newly mapped on the bovine genome
were designed from available bovine sequences stored in GenBank in regions
sharing a high sequence similarity with the corresponding human gene.
Additionally, two new microsatellite markers, INRA253 (AY195651) and
INRA264 (AY195652), were included in the map. They were developed from
BAC clones (respectively 0070B01 and 0195D10, which contain CRTL1)
according to standard procedures as described in Vaiman et al. (1994). Other
genes or anonymous markers were mapped using primers described in the
literature (see BOVMAP database: http://locus.jouy.inra.fr).

For two genes (DNMT1 and PAM) and one EST (T98796), primer pairs
resulted in co-amplification of bovine and hamster DNA in radiation hybrid
cell lines, giving a fragment of the same size. To successfully map these three
markers (respectively DNMT1, PAM and T98796), homologous primers
were designed from one or both end sequences of a BAC containing the
corresponding marker (respectively BAC 0655B09, 0531H12 and 0235C07, giving
respectively the following sets of sequences: SP655B9 (BZ548319) and
T655B9 (BZ548316), SP531H12 (BZ548317), and SP235C7 (BZ548320)
and T235C7 (BZ548318)), using the strategy described in Gautier et al.
(2001). Interestingly, a strong sequence homology was found for SP655B9
and T655B9 with human sequences, respectively exon 36 of the DNMT1
gene (86% on 198 bp, GenBank accession number: X63692) and exon 1 of
the ICAM1 gene (83% on 79 bp, GenBank accession number: X57151).

Descriptions and references of all the type I loci are given in Table 1. The
BLAST software was used for sequence comparisons
(http://www.ncbi.nlm.nih.gov/BLAST/).

PCR conditions 

PCR reactions were performed on a PTC-100 thermocycler (MJ
Research) in a 15 µl reaction volume with 1× standard Taq polymerase buffer
supplemented with 0.125 mM dNTP, 1.5 mM MgCl2, 0.5 µM of each primer
and 0.035 U/µl Taq polymerase (Promega, Madison, WI). Samples were
preheated for 5 min at 94 °C, subjected to 35 cycles (94 °C for 20 s, 56 °C for 30 s
and 72 °C for 30 s), and a final extension step of 5 min at 72 °C. Primer
references and sequences are given in Table 1.

Bovine BAC identification 

A PCR based screening was performed on the 4-genome equivalent 
INRA bovine BAC library containing 105,984 clones as previously described 
(Eggen et al., 2001). 
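
The chance that a given marker has no clone in a library of this depth can be approximated with a simple Poisson model. The sketch below assumes that clones are sampled uniformly and independently along the genome, which is an assumption made here for illustration and not a method stated in the text; the function names are likewise illustrative.

```python
import math


def prob_no_clone(genome_equivalents: float) -> float:
    """Probability that a locus is covered by no clone in the library.

    Assumes a Poisson model (uniform, independent clone sampling).
    This is an illustrative assumption, not the authors' stated method.
    """
    if genome_equivalents < 0:
        raise ValueError("genome_equivalents must be non-negative")
    return math.exp(-genome_equivalents)


def depth_for_miss_probability(p_miss: float) -> float:
    """Library depth (in genome equivalents) giving a target miss probability."""
    if not 0 < p_miss <= 1:
        raise ValueError("p_miss must be in (0, 1]")
    return -math.log(p_miss)


if __name__ == "__main__":
    # A 4-genome-equivalent library, as described for the INRA BAC library.
    print(f"P(no clone) at 4x depth: {prob_no_clone(4.0):.3f}")
```

Under this model a 4-genome-equivalent library misses a given locus a little less than 2% of the time; real libraries can deviate from it because cloning efficiency is not uniform along the genome.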

Chromosomal assignments on the INRA hamster-bovine somatic cell hybrid panel

The panel was constructed by Heuertz and Hors-Cayla (1981) and is
composed of a total of 38 hamster-bovine cell lines. A more complete
description of the panel is given in Laurent et al. (2000). A correlation
coefficient of 0.69 was used as the threshold for confident assignment of a marker
to a chromosome (Chevalet and Corpet, 1986). PCR-based assignments were
performed according to Laurent and coworkers (2000).
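
The thresholding step can be illustrated with a small sketch. The text gives the 0.69 threshold but not the exact statistic of Chevalet and Corpet (1986); the code below assumes a phi (Pearson) correlation between the presence/absence profile of a marker and that of each chromosome across the cell lines. The function names and the toy profiles are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def phi_correlation(marker: Sequence[int], chromosome: Sequence[int]) -> float:
    """Phi coefficient between two presence/absence (1/0) profiles."""
    if len(marker) != len(chromosome):
        raise ValueError("profiles must have the same number of cell lines")
    n11 = n10 = n01 = n00 = 0
    for m, c in zip(marker, chromosome):
        if m and c:
            n11 += 1
        elif m:
            n10 += 1
        elif c:
            n01 += 1
        else:
            n00 += 1
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom


def assign_chromosome(
    marker: Sequence[int],
    panel: Dict[str, Sequence[int]],
    threshold: float = 0.69,  # threshold quoted in the text
) -> Optional[Tuple[str, float]]:
    """Return (chromosome, phi) for the best match, or None below threshold."""
    best = max(
        ((name, phi_correlation(marker, profile)) for name, profile in panel.items()),
        key=lambda item: item[1],
    )
    return best if best[1] >= threshold else None


if __name__ == "__main__":
    # Hypothetical 8-line toy panel; the real panel has 38 cell lines.
    toy_panel = {"BTA7": [1, 1, 1, 0, 0, 0, 0, 0], "BTA10": [0, 1, 0, 1, 0, 1, 0, 1]}
    print(assign_chromosome([1, 1, 1, 0, 0, 0, 1, 0], toy_panel))
```

A marker whose best correlation stays below the threshold is left unassigned rather than forced onto the closest chromosome.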


Table 1. Name and description of the 61 type I markers used in our survey. Fifty-five map to the BTA07 RH3000 map (see text), three (AP3B1, HEXB and
HMGCR) were assigned on BTA10 using the INRA somatic cell hybrid panel (see text). Additionally, primer pairs for DNMT1, PAM and T98796 used to
isolate BAC clones (respectively 655B09, 531H12 and 235C07), which were subsequently end-sequenced (see text), are also given.


Locus | Locus description (name of the homologous gene in HUGO nomenclature) | Accession number | BTA | HSA | Forward primer | Reverse primer
TRIM17 | Gene (tripartite motif-containing 17) | | 7 | 1 | TCTTCCTGGATTTTGAAGCTG | CTATCCCTTCACCCACAGAGTC
ADRB2 | Gene (adrenergic, beta-2-, receptor, surface) | Z86037 | 7 | 5 | CTTCTCCCCAGTACCCTGCAA | GCCCTTCAGACATAAACTTAACA
AI461399 (FLJ30060) | Bovine EST (human hypothetical protein FLJ30060) | | 7 | 5 | BOVMAP | BOVMAP
AI461431 (FBN2) | Bovine EST (fibrillin 2 (congenital contractural arachnodactyly)) | | 7 | 5 | BOVMAP | BOVMAP
ARA54 (RNF14) | Bovine EST (ring finger protein 14) | | 7 | 5 | BOVMAP | BOVMAP
AW465535 (MAN2A1) | Bovine EST (mannosidase, alpha, class 2A, member 1) | AW465535 | 7 | 5 | GCCATGAAACAAGCTAAGCAG | TCGTCTTGCTTCTGTCAGCAT
BB719 (IL12B) | Microsatellite (in interleukin 12B (natural killer cell stimulatory factor 2, cytotoxic lymphocyte maturation factor 2, p40)) | | 7 | 5 | BOVMAP | BOVMAP
BF603251 (LMNB1) | Bovine EST (lamin B1) | BF603251 | 7 | 5 | GTATGGTAATCCTTACCTACATG | GGAAATTAAGGCCATCAGATTC
BI680325 (SIAT8D) | Bovine EST (sialyltransferase 8D (alpha-2, 8-polysialyltransferase)) | BI680325 | 7 | 5 | CTAGATGCTGAGCGAGATGTC | TGACTGTCGATCTCTTTTCCACAC
CAST | Gene (calpastatin) | | 7 | 5 | BOVMAP | BOVMAP
CD14 | Gene (CD14 antigen) | | 7 | 5 | BOVMAP | BOVMAP
CD74 | Gene (CD74 antigen) | | 7 | 5 | BOVMAP | BOVMAP
CLTB | Gene (clathrin, light polypeptide Blight chain B) | X04852 | 7 | 5 | AGGAGCAGCTGCTTTGGCCA | CGTGAGGGAATGACGGGTGA
CRTL1 | Gene (cartilage link protein) | GI746405 | 7 | 5 | AGTATTCCCTTATTTTCCACGATTG | TGCACAGACCCATCACTGAG
CSF2 | Gene (colony stimulating factor 2 (granulocyte-macrophage)) | U22385 | 7 | 5 | CCAGCCAGAAGTGGAAGCTT | CACCTGTATCAGGGTCAACAT
EST1096 (SCA12) | Bovine EST (spinocerebellar ataxia 12) | | 7 | 5 | BOVMAP | BOVMAP
F12 | Gene (coagulation factor XII (Hageman factor)) | | 7 | 5 | BOVMAP | BOVMAP
FGF1 | Gene (fibroblast growth factor, acidic endothelial growth factor) | | 7 | 5 | BOVMAP | BOVMAP
FLJ11159 | Gene (hypothetical protein FLJ11159) | AI275272 | 7 | 5 | TTGCATTAAAGCTTCACAGACT | TTCATGGCAGAGAGACGAGAT

Table 1 (continued)

Locus | Locus description (name of the homologous gene in HUGO nomenclature) | Accession number | BTA | HSA | Forward primer | Reverse primer
GPX3 | Gene (glutathione peroxidase 3 (plasma)) | | 7 | 5 | BOVMAP | BOVMAP
HNRPA0 | Bovine EST (heterogeneous nuclear ribonucleoprotein A0) | | 7 | 5 | BOVMAP | BOVMAP
HNRPAB | Bovine EST (heterogeneous nuclear ribonucleoprotein A/B) | | 7 | 5 | BOVMAP | BOVMAP
IL3 | Gene (interleukin 3 colony-stimulating factor, multiple) | L31893 | 7q15-q21 | 5 | CTGAAGTCTGAAGCCCAGTT | AGGGTCACACATATCAGGAAT
IL4 | Gene (interleukin 4) | U14159 | 7q15-q21 | 5 | GGGATACATTTCCTGCTCTCA | CCAAGTTAGTGATGTGGCCAA
IL5 | Gene (interleukin 5 colony-stimulating factor, eosinophil) | U17053 | 7q15-21 | 5 | GCACAAGGGGATGCTGTGAA | ACTTGCAGGTAGTCGAGGAA
NR3C1 | Gene (glucocorticoid receptor alpha (nuclear receptor subfamily 3, group C, member 1)) | | 7 | 5 | BOVMAP | BOVMAP
PAM | Gene (peptidylglycine alpha-amidating monooxygenase) | M37721 | 7 | 5 | TTTGATAGCAAGTTTGTTTACCAG | TGGACTGGAGTACTGCAGCAT
PDE6A | Gene (phosphodiesterase, cyclic GMP phosphodiesterase alpha subunit) | M26043 | 7 | 5 | ACCCTTCACCGACGAGAGCA | GCAGGTGAGTCTGGCACTAA
RASA1 | Gene (RAS p21 protein activator GTPase activating protein RAS p21) | | 7q24-qter | 5 | BOVMAP | BOVMAP
RME30 (CRSP9) | Microsatellite (in cofactor required for Sp1 transcriptional activation, subunit 9, 33kDa) | | 7 | 5 | BOVMAP | BOVMAP
SP235C7 | STS in SP6 end sequence from BAC containing T98796 | BZ548320 | 7 | 5 | CTAATGCTCGCTGCAGGGAA | GGTTCAGACACGAACATTCA
SP531H12 | STS in SP6 end sequence from BAC containing PAM | BZ548317 | 7 | 5 | GTGAGGAAAAGGAAGAACTGT | CATGTCTTCCTGTACTAAGAAGT
SPARC | Gene (secreted protein, acidic, cysteine-rich osteonectin) | | 7 | 5 | BOVMAP | BOVMAP
T235C7 | STS in T7 end sequence from BAC containing T98796 | BZ548318 | 7 | 5 | GACATGAGTGGAGAACCACTA | CTCAGAACTGTTGAGGTTATCT
T98796 (MEF2C) | Bovine EST (MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C)) | | 7 | 5 | BOVMAP | BOVMAP
AP3B1 | Gene (adaptor-related protein complex 3, beta 1 subunit) | BE236716 | 10 | 5 | GATTGGTTCTGTTCTGCTGCGA | GTGGATGCCAGGGACTTGTGT
APC | Gene (adenomatosis polyposis coli) | BE756741 | 10 | 5 | GAGAATAATTGCTTACAGACCT | GAGAGATTCCACAAGGTTCCG
CCNG1 | Gene (cyclin G1) | BE846245 | 10 | 5 | CACCTTGGGTGTGTTGGACTA | AGGAGTGAGTAATAGAGCTGT
BLVR (AP3B1) | Gene (Bovine leukaemia virus, cell receptor homologous to the human adaptator-related protein complex 3, delta 1, subunit) | | 7q25 | 19 | BOVMAP | BOVMAP
COMP | Gene (cartilage oligomeric matrix protein) | | 7 | 19 | BOVMAP | BOVMAP
DKZFP566B133 (GADD45B) | Bovine EST (growth arrest and DNA-damage-inducible, beta) | | 7 | 19 | BOVMAP | BOVMAP
DNASE2 | Gene (deoxyribonuclease II, lysosomal) | | 7 | 19 | BOVMAP | BOVMAP
DNMT1 | Gene (DNA (cytosine-5-)-methyltransferase 1) | NM001379 | 7 | 19 | CACTGGTTCTGCGCTGGGACA | CCTCCATGGCCCAGTTTTCGGA
EPOR | Gene (erythropoietin receptor) | U61399 | 7 | 19 | CGGAACGCGCTACACCTTCAT | ACGAGGGAGAGCGTCAGGAT
EST1067 (KIAA0876) | Bovine EST (human hypothetical protein KIAA0876) | | 7 | 19 | BOVMAP | BOVMAP
EST1379 (IL1RL1LG) | Bovine EST (T1/ST2 receptor binding protein "putative") | | 7 | 19 | BOVMAP | BOVMAP
EST1394 (R28550) | Bovine EST (homologous to a non annotated human sequence AC005776) | | 7 | 19 | BOVMAP | BOVMAP
EST1925 (OAZ1) | Bovine EST (ornithine decarboxylase antizyme 1) | | 7 | 19 | BOVMAP | BOVMAP
ETR101 | Gene (transcription factor ETR101) | | 7 | 19 | BOVMAP | BOVMAP
ICAM3 | Gene (intercellular adhesion molecule 3) | | 7 | 19 | BOVMAP | BOVMAP
IFI30 | Gene (interferon, gamma-inducible protein 30) | | 7 | 19 | Bonsdorff et al. (2003) | Bonsdorff et al. (2003)
IL27W | Gene (interleukin 27 working designation) | | 7 | 19 | Bonsdorff et al. (2003) | Bonsdorff et al. (2003)
INSL3 | Gene (insulin-like 3 (Leydig cell)) | | 7 | 19 | BOVMAP | BOVMAP
JUND | Gene (jun D proto-oncogene) | | 7 | 19 | BOVMAP | BOVMAP
MAN2B1 | Gene (mannosidase, alpha B, lysosomal) | U97686 | 7 | 19 | GTGTTACTGAGAAGACTGGCT | GAGGGCAACATTATCTCCAGT
RAB3A | Gene (RAB3A, member RAS oncogene family) | | 7 | 19 | BOVMAP | BOVMAP
T655B9 (ICAM1) | STS in T7 end sequence from BAC containing DNMT1 (intercellular adhesion molecule 1 (CD54), human rhinovirus receptor) | BZ548316 | 7 | 19 | GCGGAAAACCTGTAGATCCT | CTCTGCTGCCCGGTGAGTTA
VAV1 | Gene (vav 1 oncogene [vav 1 proto-oncogene]) | | 7 | 19 | BOVMAP | BOVMAP
SP655B9 (DNMT1) | STS in SP6 end sequence from BAC containing DNMT1 (DNA (cytosine-5-)-methyltransferase 1) | BZ548319 | 7 | 19 | GTGGTAGTTGTACCGCAGCT | CTCCTAGTCGCCTCCAATGT
EST0037 | Bovine EST (no significant homology found) | | 7 | ? | BOVMAP | BOVMAP
EST0196 | Bovine EST (no significant homology found) | | 7 | ? | BOVMAP | BOVMAP
BOVMAP 

Bovine whole genome radiation hybrid panel genotyping and map construction

PCR reactions were performed on the 94 radiation hybrid cell lines
which constitute the newly developed 3,000-rad bovine panel using Wg3H as
recipient (Williams et al., 2002).

The Carthagene software (Schiex et al., 2003) was used to perform
two-point and multipoint analyses of the radiation hybrid data and to provide a
comprehensive map of BTA7. The distances between markers on the most
likely map were calculated using the RHMAP 3.0 software (Lange et al.,
1995) under the equal retention probability model.

Results 

Chromosomal assignment of newly developed human genes in cattle

Of the 27 new primer pairs (Table 1) developed in this study
for Type I markers, 18 concern genes previously reported as
mapping to BTA7 but for which either primer pairs failed to
specifically amplify bovine DNA or were not available (see
BOVMAP database). The nine remaining primer pairs
represent new genes assigned on the bovine genome using the bovine
INRA somatic hybrid cell panel (Laurent et al., 2000).

More precisely, six genes mapped to bovine BTA7 and were
integrated into the radiation hybrid map (see below): one
(TRIM17) originates from HSA1, four (AW465535,
BF603251, BI680325 and FLJ11159) from HSA5 and one
(VAV1) from HSA19. The three others (APC, AP3B1 and
CCNG1) originate from HSA5 but map to BTA10. This latter
result confirms a group of conserved synteny between these two
species, which until now was supported by only one gene
(HMGCR).

These new assignments were helpful in refining the
boundaries of blocks of conserved synteny between BTA7 and HSA5
(see below and Fig. 1C).

Construction of a comprehensive BTA7 RH3000 map and comparison with previous maps

The BTA7 radiation hybrid map contains 108 markers
(including 54 genes, see below) and is 2,739 cRay3000 long
(Fig. 1D). Among these 108 markers, 35 are also present in the
USDA genetic map (Kappes et al., 1997), which contains 45
markers, and 13 are present in the IBRP genetic map (Barendse
et al., 1997), which contains 14 markers. The order on the map
presented here is identical to the IBRP map and very similar to
the USDA genetic map.

As previously described for BTA15 (Gautier et al., 2002)
and BTA26 (Gautier et al., 2003), we have calculated the
transforming ratio from cRay3000 (cR3000) to cM using the two most
distant markers common to our map and to these two genetic
maps (Gautier et al., 2002). Similar results were obtained in
both cases and were remarkably consistent with those obtained
for the BTA15 (Gautier et al., 2002) and for the BTA26
(Gautier et al., 2003) RH maps: 1 cR3000 to 0.0508 cM for the USDA
map (0.047 for BTA15 and 0.048 for BTA26) and 1 cR3000 to
0.0500 cM for the IBRP map (0.052 for BTA15 and 0.037 for
BTA26).
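
The ratio described above is simply the genetic distance between the two most distant shared markers divided by their RH distance. A minimal sketch follows; the marker positions used in the example are hypothetical (chosen only so that the ratio matches the 0.0508 reported for the USDA map), and applying the ratio to the full 2,739 cR3000 map length is an illustration rather than a result stated in the text.

```python
def cm_per_cr(cm_a: float, cm_b: float, cr_a: float, cr_b: float) -> float:
    """Ratio of genetic (cM) to RH (cR) distance between two shared markers."""
    cr_span = abs(cr_b - cr_a)
    if cr_span == 0:
        raise ValueError("the two markers must have distinct RH positions")
    return abs(cm_b - cm_a) / cr_span


def cr_to_cm(cr_distance: float, ratio: float) -> float:
    """Convert an RH distance to an approximate genetic distance."""
    return cr_distance * ratio


if __name__ == "__main__":
    # Hypothetical positions of the two most distant common markers.
    ratio = cm_per_cr(cm_a=10.0, cm_b=137.0, cr_a=100.0, cr_b=2600.0)
    print(f"ratio: {ratio:.4f} cM per cR3000")
    print(f"2,739 cR3000 is roughly {cr_to_cm(2739, ratio):.1f} cM")
```

A single genome-wide ratio hides local variation in recombination and breakage frequencies, so it is best read as an average conversion factor.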

In addition, three other radiation hybrid maps have already
been developed for BTA7. Two (Gu et al., 1999; Band et al.,
2000) were constructed using the same 5,000-rad bovine RH
panel (Womack et al., 1997). The third was developed during
the characterization of the 3,000-rad panel used in this study
(Williams et al., 2002). The map by Gu and coworkers (1999)
contains 32 markers (including seven genes) and is 960 cR5000
long (versus 2,739 cR3000 in our map). The region spanning
from BM7160 to BL1043 is the same as that of our map and the
order is very similar. The map by Band and coworkers (2000)
contains 53 markers (39 type I markers) and is unexpectedly
half the size (457 cR5000) of that of Gu et al. (1999) although the
map spans a smaller interval from EST0037 to PAM
(equivalent to SP531H12 on our map). Forty-two markers are common with
our map but their order shows more pronounced differences.
Finally, the map by Williams et al. (2002) contains 51 markers
for a total length of 2,215 cR3000. Our map corresponds to this
previous map with the addition of 57 new markers (essentially
type I markers for comparative mapping purposes). A two-fold
increase of the number of markers resulted only in increasing
the length by 30%. Thus, we may be close to the optimal limit
of saturation of our map regarding the resolution of our panel
(all the markers being inside a single linkage group at the Lod
2pt = 4.0 threshold).

Together, these different results appear to reveal a higher 
resolution than expected for our panel. It seems to have even 
more resolution than the 5,000-rad map, when considering the 
number of markers and the corresponding map length. 

Comparative map construction 

Of the 108 BTA7 markers included on the map, 55 (50%)
are associated with 54 different type I markers since two
markers (SP235C7 and T235C7) are related to the same EST
(T98796) (see Materials and methods). Fifty-one of these 54 (93%) are
orthologs of human genes precisely mapped to the human
genome: one on HSA1, 31 on HSA5 and 19 on HSA19 (see
Table 1). For the remaining three (EST1394, EST0037 and
EST0196 in the order of the map), no significant homology
could be found with any annotated human sequence.
Nevertheless, a strong sequence homology (415 bp, 80%) could be found
between EST1394 and an anonymous sequence included in a
human cosmid (AC005776) localized on HSA19. Thus it might
correspond to a non-annotated human gene.

Finally, taking the human draft sequence assembly released
in April 2003 available in the UCSC database
(http://genome.ucsc.edu/) as a reference, we could draw a detailed comparative
map of HSA1 (Fig. 1A), HSA5 (Fig. 1C) and HSA19 (Fig. 1B)
versus BTA7 (Fig. 1D). We have also reported in Fig. 1 seven


Fig. 1. Comparative map between the radiation hybrid map of BTA7 (D)
and the draft sequence assemblies of HSA1 (A), HSA5 (C) and HSA19 (B).
Physical distances on the human maps (A, B and C) are from UCSC (April
2003, http://genome.ucsc.edu/). In (D) type I markers are in bold characters
and markers sharing the same bovine BAC address are indicated with
brackets. Curved arrows on the bovine RH map point to inversions in the gene
order as compared to human (left side of the corresponding bovine map).
The lengths of each block of conserved synteny identified as S1 to S9 in Fig. 1
are given in Table 2. To provide a more precise view of their boundaries,
seven gene localizations (three from this study and four from previous
reports) are indicated on the left side of HSA19 (B) and HSA5 (C) maps.
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Table 2. Description of blocks of conserved 

synteny from the comparative map between BTA7, HSA1, HSA5 and HSA19

Block of conserved synteny   Marker interval   HSA   HSA length (kb)   BTA length (cRay3000)   Converting ratio (kb x Ray^-1)
S2                                             19                                              32.4
S3                                             19                                              8.50
S4                           MAN2B1/EST1394    19                      193                     13.5
S5                                             19                      49                      19.7
S6                           CRTL1/MAN2A1      5     26260
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HSA5 or HSA 19 genes mapping to bovine chromosomes 
BTA10, BTA18 or BTA20, thus refining the boundaries of the
blocks of conserved synteny:

- from HSA5: HEXB and LCP2, mapped to BTA20 (see the
BOVMAP database), and HMGCR (see BOVMAP), together with
the three newly assigned genes (AP3B1, APC and CCNG1),
mapping to BTA10;

- from HSA19: GPI, mapping to BTA18 (see BOVMAP).

BAC library screening

Except for RAB3A, IDVGA90 and INRA053, which failed
to amplify in the bovine BAC superpools, and INRA229 and
BM7160, which seem to be absent from our library, at least one
address could be obtained for each of the markers (data and
clones are available upon request). Given the bovine genome
coverage of our library, the probability of having no clone for a
given marker is expected to be 1.5% (Eggen et al., 2001). The
results of the screening of BTA7 markers are thus consistent
with this expectation, since only two markers (1.9%) out of the
106 for which PCR amplification was successful seem to be absent.
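
The consistency argument above can be checked numerically. The following is a minimal sketch, assuming that markers are missing from the library independently of one another, each with the quoted probability of 1.5%. The independence assumption and the function names are ours and are not taken from Eggen et al. (2001).

```python
from math import comb


def expected_absent(n_markers: int, p_absent: float) -> float:
    """Expected number of markers with no clone in the library."""
    return n_markers * p_absent


def prob_at_least(k: int, n_markers: int, p_absent: float) -> float:
    """Binomial probability of observing k or more absent markers."""
    below_k = sum(
        comb(n_markers, i) * p_absent**i * (1.0 - p_absent) ** (n_markers - i)
        for i in range(k)
    )
    return 1.0 - below_k


n, p = 106, 0.015  # values quoted in the text
print(round(expected_absent(n, p), 2))   # expected number of absent markers
print(round(prob_at_least(2, n, p), 2))  # chance of seeing two or more absent
```

Under these assumptions about 1.6 markers are expected to be absent, so observing two is unremarkable.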

Additionally, six addresses were found to be shared by
several closely located markers on the BTA7 RH map
(Fig. 1D):

- 0058H05 contains IL4, IL4MS and BOBT24. This interval
spans 7 cR3000 on our RH map

- 0035G02 contains CSF2 and IL3, which are 6.3 cR3000
apart on our RH map

- 0235C07 contains T235C07, BM2209 and SP235C07. This
interval spans 3.7 cR3000 on our RH map

- 0561C10 contains BMS522 and AI461399, which are
4.2 cR3000 apart on our RH map

- 0655B09 contains ICAM1 (T655B09), IDVGA62A and
DNMT1 (SP655B09). This interval spans 16.1 cR3000 on
our RH map

- 0852D03 contains BP41 and BL1067, which are
15.5 cR3000 apart on our RH map

These results agree with those obtained for BTA15 and
BTA26 (Gautier et al., 2002, 2003), allowing us to draw
similar conclusions: the resolution of our panel is on the
level of the average BAC size in our library (about
120 kb).


Discussion 

For 52 type I markers included in our BTA7 map, related
orthologous sequences could be found on the human genome
sequence assembly (see Fig. 1). Moreover, all of them map to
HSA1, HSA5 or HSA19, confirming previous comparative
mapping results. Nevertheless, Band and coworkers (2000)
mapped CCT8 (AW244896), which is localized on HSA14, to
BTA7. This synteny could not be confirmed in our study since
we were not able to integrate CCT8 into our map (the primer
pairs failed to amplify properly). However, as the coverage of
our comparative map is relatively homogeneous, this synteny
may result from a discrepancy caused by paralogy.

Interestingly, radiation hybrid mapping provides information
about the order of genes inside the blocks of conserved
synteny. Nine conserved syntenies were identifiable in our study:

- one segment between HSA1 and the centromeric region of
BTA7 (S1 on Fig. 1). As described in Hayes et al. (personal
communication), the localization of TRIM17 confirms the
existence of S1, previously supported by only one gene (GUK1;
Band et al., 2000);

- four segments of conserved synteny between BTA7 and
HSA5 (from S6 to S9 on Fig. 1);

- four segments of conserved synteny between BTA7 and
HSA19 (from S2 to S5 on Fig. 1).

Of the eight segments containing more than one gene, six
(S2, S4, S6, S7, S8 and S9) have a gene order strictly conserved
between the BTA7 bovine RH map and the human genome
sequence assembly. The two other segments (S3 and S5) each
exhibit only one rearrangement between closely located genes,
which might be the result of a small discrepancy in our map.
In particular, the rearrangement in S3 consists of an inversion
between ICAM1 and DNMT1, which are localized in the same
BAC clone.

Thus, as previously described, these data allowed us to
calculate the converting ratio between radiation hybrid
distances in cR3000 from our BTA7 RH map and physical
distances in kb taken from the human genome assembly
(details for each block are given in Table 2).

Blocks from S2 to S5, all mapping to HSA19, are shorter
than blocks from S6 to S9 (mapping to HSA5). Values for the
converting ratio range from 8.50 (S2) to 66.8 (S5) and those for
blocks S1 to S4 (ranging from 8.50 to 32.4) are smaller than
those for blocks S5 to S8 (from 26.4 to 66.8). These differences
between the two groups of blocks of conserved synteny seem
to be related to their relative lengths (HSA19 blocks tend to be
smaller than HSA5 blocks: on average 135 cR3000 versus
310 cR3000, respectively). We then calculated a global converting
ratio as the average of the eight individual ratios (for
S2 to S9) weighted by the length of each block on the BTA7 RH
map, in order to minimize the differences due to their size
variation. This resulted in a ratio of 36.7 (1 cR3000 for 36.7 kb),
which is in good agreement with the results obtained for BTA15
(1 cR3000 for 35.4 kb in Gautier et al., 2002) and BTA26
(1 cR3000 for 29 kb in Gautier et al., 2003) using the same
procedure. It should also be noted that these ratios are concordant
with those based on genetic distances. Indeed, we previously
found (see above) that 1 cR3000 corresponds to about 0.05 cM
(previous results for BTA15 and BTA26 lead to a similar ratio).
Assuming that 1 cM corresponds to about 1 Mb (as a first
approximation, we neglect cold and hot spots since we are
considering the entire chromosome), this gives a ratio of
1 cR3000 for 50 kb. These two independent methods of estimating
the correspondence between radiation hybrid distance and
physical distance are thus concordant, although their calculation
is based on strong hypotheses. Moreover, as mentioned earlier,
our panel seems to have far more resolution than expected given
the relatively low radiation dose used for its construction.
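
The two estimates discussed above can be sketched as follows. This is only an illustration: the block values in the example are placeholders chosen for easy arithmetic, not the entries of Table 2, and the function names are ours.

```python
def weighted_ratio(blocks):
    """Average of per-block kb/cR ratios, weighted by block length in cR.

    blocks: list of (bta_length_cr, ratio_kb_per_cr) tuples.
    """
    total_length = sum(length for length, _ in blocks)
    return sum(length * ratio for length, ratio in blocks) / total_length


def kb_per_cr_from_genetic(cm_per_cr, mb_per_cm=1.0):
    """Convert a cM-per-cR ratio into kb per cR, assuming mb_per_cm Mb per cM."""
    return cm_per_cr * mb_per_cm * 1000.0


# Placeholder blocks (hypothetical values): (length in cR3000, ratio in kb per cR3000)
example_blocks = [(200.0, 10.0), (50.0, 20.0), (250.0, 40.0)]
print(weighted_ratio(example_blocks))   # weighted mean ratio for the placeholders
print(kb_per_cr_from_genetic(0.05))     # genetic-distance estimate, 1 cM taken as 1 Mb
```

Because each individual ratio is a human length in kb divided by a bovine length in cR3000, weighting by the bovine length amounts to dividing the summed human lengths by the summed bovine lengths.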

In addition, using this global converting ratio, we estimate
the total length of BTA7 to be about 101 Mb (= 0.0367 x
2,739). Popescu et al. (1996) reported that this chromosome
represents 4.18% of the haploid bovine genome. Assuming that
the bovine genome measures 3,000 Mb, this figure corresponds
to a size of about 125 Mb for BTA7. Incomplete marker
saturation of the radiation hybrid map could explain the
difference between these two estimates.
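
The two length estimates can be reproduced with a few lines of arithmetic. In this sketch the numerical inputs are those quoted in the text; the function names are ours.

```python
def rh_length_mb(ratio_kb_per_cr, map_length_cr):
    """Chromosome length from the RH map: kb per cR times map length, in Mb."""
    return ratio_kb_per_cr * map_length_cr / 1000.0


def cyto_length_mb(genome_mb, fraction_of_genome):
    """Chromosome length from its share of the haploid genome, in Mb."""
    return genome_mb * fraction_of_genome


rh = rh_length_mb(36.7, 2739)        # global converting ratio x RH map length
cyto = cyto_length_mb(3000, 0.0418)  # 4.18% of an assumed 3,000 Mb genome
print(round(rh, 1), round(cyto, 1))
print(round(100.0 * (cyto - rh) / cyto, 1))  # shortfall of the RH estimate, in percent
```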

Finally, this map represents a powerful tool to assist the
mapping of genes underlying traits of economic interest. In
particular, it will permit direct anchoring of location intervals
on the human genome assembly, thus benefiting from the
extensive functional information available there. Moreover, all
the markers were screened in our bovine BAC library, providing
anchors for the physical map of the bovine genome (Schibler et
al., 2003), improving positional cloning strategies and initiating
the whole genome sequencing of the bovine genome.
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Abstract. We have constructed a medium density physical 
map of bovine chromosome 19 by mapping loci on both a bovine
bacterial artificial chromosome (BAC) scaffold map and a whole
genome radiation hybrid (WGRH) panel. The resulting map
contains 70 loci spanning the length of bovine chromosome 19.
Three contiguous groups of BACs were identified on the basis of
multiple loci mapping to individual BAC clones. Bovine
chromosome 19 was found in this study to be composed almost
entirely of regions of human chromosome 17, with a small region
putatively assigned to human chromosome 10. Fourteen
breakpoints between the bovine and human chromosomes were
detected, with a possibility of five more based on the ordering
of the WGRH map.

Copyright©2003 S. Karger AG, Basel 


Comparative maps can be powerful tools with multiple
applications in genome annotation, gene discovery and
evolutionary genomics. An international effort to construct a
Bacterial Artificial Chromosome (BAC) scaffold map of the bovine
genome (http://www.bcgsc.ca/lab/mapping/bovine) provides
an opportunity to place genes very precisely relative to one
another in the bovine genome. Furthermore, coupled with
methodologies such as Whole Genome Radiation Hybrid (WGRH)
mapping and with existing linkage maps, the BAC scaffold data
can pinpoint with great precision evolutionary breakpoints
between the bovine genome and more intensively studied genomes
such as that of the human. The precise number and location of
such breakpoints are not only of interest in understanding
evolutionary relationships between species, but are also critical
to the utilisation of comparative maps in gene discovery
experiments, such as the identification of candidate genes
underlying QTL.

Bovine chromosome 19 (BTA19) was chosen as a candidate
chromosome for the application of this approach because of the
relatively large number of gene sequences recently mapped on
this chromosome by Stone et al. (2002). This provided 38 gene
loci in addition to the genes and DNA markers previously mapped
on the bovine linkage map (Kappes et al., 1997). A large number
of genes and markers from BTA19 were also recently mapped on a
WGRH map by Williams et al. (2002).
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In this study, we have used overgo hybridization to map 
multiple genes and DNA marker loci on BTA19. A total of 70
genes and DNA markers have been placed on the BAC scaffold
map. Of these 70 loci, 43 were successfully mapped on the
WGRH map (Williams et al., 2002), 49 were mapped on the
linkage map (Kappes et al., 1997; Stone et al., 2002) and 28
were mapped on both the linkage and WGRH maps. The
WGRH and linkage mapping results, together with the BAC
mapping results and information from the human draft
sequence, were used to assign a putative order of loci along
BTA19 and thus deduce the number and position of evolutionary
breakpoints between the bovine and human genomes.

Materials and methods 

Bovine BAC libraries 

The CHORI-240 bovine BAC library, developed by Chung-Li Shu and
Kazutoyo Osoegawa and distributed by the BACPAC Resource Center at the
Children’s Hospital Oakland Research Institute (USA;
http://bacpac.chori.org/home.htm), was used for the majority of this study.
This library was prepared from genomic DNA isolated from white blood cells
of a Hereford bull, L1 Domino 99375 (USDA-ARS, Miles City, MT), and
partially digested with MboI. The DNA was then size selected and cloned into
the pTARBAC1.3 vector between BamHI sites. Ligation products were
transformed into DH10B electrocompetent cells (BRL Life Technologies) for
segment 1 and DH10B T1 phage resistant electrocompetent cells (BRL Life
Technologies) for segment 2. Segments 1 and 2 consist of approximately
102,252 and 89,483 clones, respectively, and combine to provide
approximately 10.7-fold coverage of the bovine genome.

The RPCI-42 bovine BAC library was used to map PHB, KRT19, GAS,
GFAP and MAPT and to provide additional clone assignments for the genes
THRA, IGFBP4, MTMR4 and GH1. This library was also constructed by
Chung-Li Shu and is available through the BACPAC Resource Center
(http://bacpac.chori.org/). White blood cell DNA from a Holstein bull was
isolated to construct this library. This DNA was partially digested with a
combination of EcoRI and EcoRI methylase. Size-selected DNA was then cloned
into the pBACe3.6 vector (Frengen et al., 1999) between the EcoRI sites. The
ligation products were transformed into DH10B electrocompetent cells (BRL
Life Technologies). Segments 1 and 2 of the RPCI-42 library consist of
approximately 108,776 and 107,663 clones, respectively, and combine to
provide 11.9-fold coverage of the bovine genome (Warren et al., 2000).

The Texas A&M bovine BAC library was used to map BM9202,
HEL10, NF1, CRYB1, TP53, CHRNB1, P4HB, BMC1013 and 5BMS and
to provide additional clone assignments for ILSTS14 and GH1.
Construction of the Texas A&M BAC library was originally described in Cai
et al. (1995). Briefly, genomic DNA from Angus bull TAMU Shoshone Y6
(11519666) was partially digested with HindIII, size selected, cloned into
the pBeloBAC11 vector and transformed into DH10B electrocompetent cells
(BRL Life Technologies). A total of 45,000 Angus BAC clones from this
library were contributed to the international BAC mapping effort, including
all clones with associated mapping data. BAC DNA from this library was
pooled for PCR-based screening, with three rounds of PCR yielding the
address of a positive clone. First, 71 superpools, each containing DNA from
12 plates of 96 BAC clones, were screened. Second, twelve single pools,
representing DNA from each of the twelve plates in the positive superpool,
were screened. Finally, the pooled row and column DNA from the positive
plate were screened, with the intersection of the row and column identifying
the location of the positive BAC. PCR products generated from positive BAC
clones were verified by sequencing prior to mapping.
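
The three-round screening scheme can be sketched as a small simulation. It is only an illustration under our own assumptions: each 96-clone plate is taken to be laid out as 8 rows by 12 columns, a PCR on a pool is modelled as a simple membership test, and the function and variable names are hypothetical.

```python
from itertools import product

N_SUPERPOOLS = 71            # from the text
PLATES_PER_SUPERPOOL = 12    # from the text
ROWS = "ABCDEFGH"            # assumption: 96 wells laid out as 8 rows
COLS = range(1, 13)          # by 12 columns


def screen(positive_clones):
    """Simulate the three PCR rounds.

    positive_clones: set of (superpool, plate, row, column) tuples for the
    clones that carry the marker. Returns (candidate_addresses, n_pcr).
    """
    candidates, n_pcr = [], 0
    for sp in range(1, N_SUPERPOOLS + 1):                 # round 1: superpools
        n_pcr += 1
        if not any(c[0] == sp for c in positive_clones):
            continue
        for plate in range(1, PLATES_PER_SUPERPOOL + 1):  # round 2: plate pools
            n_pcr += 1
            on_plate = [c for c in positive_clones if c[:2] == (sp, plate)]
            if not on_plate:
                continue
            n_pcr += len(ROWS) + len(COLS)                # round 3: row and column pools
            pos_rows = sorted({c[2] for c in on_plate})
            pos_cols = sorted({c[3] for c in on_plate})
            # Every positive row/column intersection is a candidate address.
            candidates += [(sp, plate, r, k) for r, k in product(pos_rows, pos_cols)]
    return candidates, n_pcr


print(screen({(5, 3, "C", 7)}))  # one positive clone: 71 + 12 + 20 = 103 PCRs
```

With a single positive clone the intersection is unique. Two positive clones on the same plate in different rows and columns would give four candidate intersections, which is one reason why verification of the PCR product remains useful.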

Oligonucleotide probe and PCR primer design 

Oligonucleotide (Overgo) probes and PCR primers for putative BTA19
genes and markers were designed using the Overgo 1.02i web-based program
(Cai et al., 1998;
http://www.mouse-genome.bcm.tmc.edu/webovergo/OvergoInput.asp). When
possible, oligonucleotides were designed first and then a single PCR primer
was designed to match either the forward or reverse oligonucleotide.
Oligonucleotides for the probes and primers were purchased from Qiagen
Inc.’s Operon Technologies and were synthesized at a 50 nmol scale,
dissolved in sterile, deionized H2O to a concentration of 100 pmol/µl and
then diluted with sterile, deionized water to a working concentration of
10 pmol/µl. Probe pairs and primers were stored at 4 °C.

Oligonucleotide probe labeling 

Oligonucleotide probe pairs were labeled in separate 0.5-ml Eppendorf
tubes with 6.8 µl of freshly prepared master mix, consisting of 0.2 µl dATP
(2.5 mM), 0.1 µl BSA (10 mg/ml), 4.0 µl of OLB (see
http://www.tree.caltech.edu/protocols/overgo.html for composition), 2.5 µl
of sterile H2O, 1.0 µl of the combined Overgo oligonucleotides
(10 pmol/µl), 1.2 µl of fresh 32P-dCTP (2.5 µCi) and 1 µl of Klenow enzyme
(2 units/µl; Roche) for a final reaction volume of 10 µl. Labeling reactions
were carried out in a 37 °C water bath for 1 h. Initially, probes were
cleaned using the Qiagen Nucleotide Removal Kit (Cat. #28304) following the
manufacturer’s instructions. However, this proved to be unnecessary and
subsequent hybridizations were performed using the “dirty” probes. Probes
were arranged in their individual tubes in a row and column grid to
facilitate systematic pooling for subsequent hybridizations and final
probe/clone matching. A nematode BAC clone was spotted in a diagonal
duplicate pattern on each corner of the six panels of the high density
filters for the RPCI-42 and CHORI-240 libraries to facilitate alignment of
the filters with a grid pattern after probe hybridization
(http://bacpac.chori.org/anchors.htm). An oligonucleotide pair
corresponding to the nematode BAC, with sequences
5'-GTTGCCAAATTCCGAGATCTTGGC-3' and 5'-ATCATGTGGCTTCGTCGCCAAGAT-3', was
labeled and included as a probe for grid alignment.
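
Overgo pairs are generally designed so that their 3' ends are complementary, which lets Klenow enzyme fill in the remainder of each strand with labeled nucleotides. The sketch below checks this property for the grid-alignment pair, reading the two sequences quoted above without the spaces and the line-break hyphen. The helper names are ours and the check is only an illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overlap(oligo_a: str, oligo_b: str) -> int:
    """Length of the longest 3' end of oligo_a that pairs with the 3' end of oligo_b."""
    rc_b = reverse_complement(oligo_b)
    for k in range(min(len(oligo_a), len(rc_b)), 0, -1):
        if oligo_a[-k:] == rc_b[:k]:
            return k
    return 0


oligo_a = "GTTGCCAAATTCCGAGATCTTGGC"
oligo_b = "ATCATGTGGCTTCGTCGCCAAGAT"
overlap = three_prime_overlap(oligo_a, oligo_b)
# Overlap length, then length of the filled-in double-stranded probe.
print(overlap, len(oligo_a) + len(oligo_b) - overlap)  # prints "8 40"
```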

Filter hybridization 

Three BAC filters were separated by acrylic mesh, rolled together so that
one third of each overlapped with another filter and inserted into a glass
hybridization cylinder. Filters were pre-wet using 2x SSC and then
prehybridized for a minimum of 3 h in 25 ml of hybridization buffer (1% BSA,
1 mM EDTA, 7% SDS and 0.5 M sodium phosphate) at 58 °C. The pooled
probes (5 µl of each probe for the first hybridization; 2.5 µl of the relevant
probes for each of the second and third hybridizations), along with salmon
sperm DNA (7.7 µl of 10 µg/ml per 5 ml of hybridization buffer), E. coli
DNA (5 µl of 5 µg/ml per 5 ml of hybridization buffer) and C0t DNA (10 µl of
0.65 µg/ml per 5 ml of hybridization buffer), were then denatured at 98 °C
for 10 min and immediately transferred to the hybridization cylinder
containing the BAC filters. Hybridization was carried out overnight at 58 °C.
After hybridization, the filters were washed in 2x SSC, 0.1% SDS at room
temperature for 30 min, followed by 1.5x SSC, 0.1% SDS and 0.2x SSC,
0.1% SDS for 45 min each at 65 °C. If counts, as checked by Geiger counter,
were still high (above 3000 counts/min), the filters were washed once more in
0.2x SSC, 0.1% SDS for 45 min at 65 °C. Normally filters averaged 1000
counts/min. The filters were then wrapped in plastic wrap and exposed
overnight at -80 °C to Kodak X-ray film (X-OMAT AR) using an intensifying
screen. The radioactivity on the filters was allowed to decay and filters were
reused after washing to remove the previous probes.

Positive clone identification and secondary and tertiary screening 

The library addresses of positive clones were identified by aligning the
transparent overlay provided by BACPAC Resources via the anchor gene
spots according to the instructions accompanying the filters. Clones were
then ordered from BACPAC Resources. Once the stab cultures were
received, they were streaked on 1.5% agar in LB medium supplemented with
chloramphenicol (20 mg/ml in ethanol) and grown overnight at 37 °C. Clones
were then picked, grown overnight at 37 °C in LB medium containing
chloramphenicol (20 mg/ml in ethanol) in a shaking incubator and spotted
onto nylon filters using the high density replicating tool of the Biomek 2000
robotic system (Beckman); the filters were then placed on top of 1.5% agar in
LB medium and the clones allowed to grow once more overnight at 37 °C. The
colonies were lysed using standard procedures and crosslinked to the filters
using UV light. Individual filters were then probed with one of the possible
row or column combinations of probes in such a way that each row or column
pool was hybridized to an individual filter. Hybridizations were carried out
as described for the previous BAC filter hybridization. The hybridization
results from each column and row were then compared and clones found in
both a row and a column were assigned to the probe found at the intersection
of that row and column in the original probe grid arrangement. More detailed
descriptions of the pooling-hybridization procedure used to identify
probe-clone combinations can be found in Cai et al. (1998) and Han et al.
(2000). A third hybridization was performed as outlined above using filters
spotted with clones theoretically corresponding to a single probe. Only
individual probes, rather than pools, were used in a particular hybridization
cylinder in this round.
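
The row and column comparison described above amounts to intersecting sets of positive clones. The following is a minimal sketch: the probe names are taken from Table 1 but their grid positions, the clone labels and the function name are hypothetical.

```python
def assign_clones(probe_grid, row_hits, col_hits):
    """Assign clones to probes from pooled hybridization results.

    probe_grid[r][c] is the probe placed at row r, column c of the probe grid.
    row_hits[r] and col_hits[c] are the sets of clones positive with the
    pooled probes of row r and column c, respectively.
    """
    assignments = {}
    for r, row in enumerate(probe_grid):
        for c, probe in enumerate(row):
            shared = row_hits[r] & col_hits[c]  # positive in both pools
            if shared:
                assignments[probe] = sorted(shared)
    return assignments


# Hypothetical 2 x 2 probe grid and hybridization results
probe_grid = [["GH1", "THRA"], ["RARA", "TCAP"]]
row_hits = [{"clone1", "clone2"}, {"clone3"}]
col_hits = [{"clone1", "clone3"}, {"clone2"}]
print(assign_clones(probe_grid, row_hits, col_hits))
```

A clone that is positive in more than one row pool or column pool is assigned to several probes by this rule, so the single-probe hybridization of the final round is what resolves such cases.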

PCR confirmation of the probe assignments 

PCRs were run for further confirmation of the identity of individual
clones. A 10 µl reaction volume was used, consisting of 1 µl of a 1:100
dilution of boiled BAC culture, 0.5 µl of each of the forward and reverse
PCR primers, 1 µl 10x buffer, 1 µl dNTPs (2.5 mM), 1 µl MgCl2 (25 mM), 1 µl
of AmpliTaq DNA polymerase (1 U; Roche) and 4 µl of H2O. Samples were
amplified in a GeneAmp 9700 thermocycler (Applied Biosystems). Samples were
initially denatured for 2 min at 94 °C, followed by 35 cycles of denaturing
at 94 °C for 30 s, annealing at the primer-specific annealing temperature
for 30 s and extension at 72 °C for 30 s. A final extension phase of 4 min
at 72 °C was used. Samples were then cooled and held at 4 °C until they were
run on a 4% agarose gel for verification of product sizes.

Whole genome radiation hybrid mapping 

Genes and markers were mapped using the Roslin Institute’s bovine
whole genome radiation hybrid (WGRH) panel (Williams et al., 2002).
Briefly, PCR primers were synthesized using published sequences. The genes
and markers were typed on DNA from 94 radiation hybrid lines together with
control bovine and hamster DNA by PCR in 96-well microtitre plates. PCR
reactions of 20 µl DNA, 5.7% sucrose, 42.25 ng/µl cresol red, 0.1 mM dNTPs,
0.25 µM primers, 0.45 U Taq polymerase, and between 1 and 3 mM MgCl2,
optimized for individual primers, were used. A touch-down PCR method
was run consisting of 3 min at 94 °C, followed by 14 cycles of 30 s at 93 °C,
30 s at an annealing temperature starting at 7 °C above the optimized
temperature for the primers and decreasing by 0.5 °C per cycle, followed by
1 min at 72 °C, then 25 cycles of 30 s at 93 °C, 30 s at the annealing
temperature for the primers and increasing by 1 s per cycle, followed by
1 min at 72 °C. The PCR was ended with a 5 min extension phase at 72 °C. All
reactions were run in duplicate and the presence or absence of a PCR product
was determined using 96-well 2.5% mini-agarose gel electrophoresis and
ethidium bromide staining. Gels were scored twice by separate individuals.
PCRs and gels were repeated if there was a significant variation in results
between duplicates or individual scorers. The Carthagene software package
(Schiex et al., 2001; www.inra.fr/bia/T/CarthaGene/) was used to produce the
map of the chromosome.
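
The touch-down profile can be written out step by step. This sketch makes two assumptions of ours about ambiguous wording: the first touch-down cycle anneals at exactly 7 °C above the optimized temperature, and the increase of 1 s per cycle in the second phase applies to the annealing step. The function name is hypothetical.

```python
def touchdown_program(optimal_anneal_c):
    """Return the PCR program as a list of (step, temperature in C, seconds)."""
    steps = [("initial denaturation", 94.0, 180)]
    # Phase 1: 14 touch-down cycles, annealing temperature falls 0.5 C per cycle.
    for cycle in range(14):
        anneal = optimal_anneal_c + 7.0 - 0.5 * cycle
        steps += [("denature", 93.0, 30), ("anneal", anneal, 30), ("extend", 72.0, 60)]
    # Phase 2: 25 cycles at the optimized temperature, annealing time grows 1 s per cycle.
    for cycle in range(25):
        steps += [
            ("denature", 93.0, 30),
            ("anneal", optimal_anneal_c, 30 + cycle),
            ("extend", 72.0, 60),
        ]
    steps.append(("final extension", 72.0, 300))
    return steps


program = touchdown_program(55.0)  # 55 C is an arbitrary example temperature
print(len(program))
print(program[2], program[41])     # first and last touch-down annealing steps
```

Under this reading the fourteenth touch-down cycle anneals at 0.5 °C above the optimized temperature, so the second phase continues the same 0.5 °C step down to the optimum.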

Analysis of results 

Gene or marker probes that were positively assigned to particular clones
were placed onto the bovine BAC physical map being constructed as part of
an international effort at the Genome Sciences Centre (GSC) of the BC
Cancer Agency in Vancouver, Canada
(http://www.bcgsc.ca/lab/mapping/bovine). The location of individual genes
in the human genome was determined based on the human draft sequence and
the results were compared to the putative location of the gene based on the
bovine linkage map (Kappes et al., 1997; Stone et al., 2002), the whole
genome radiation hybrid map (Williams et al., 2002) and the human-bovine
comparative map (Band et al., 2000).


Results and discussion 

The identification of positional candidate genes for
economically important traits in cattle requires the availability
of high-density marker and gene maps and, given the wealth of
information available for humans, detailed comparative maps.
In addition, successful QTL mapping and candidate gene
identification require that the gene/marker order on available
maps be correct. Constant refinement is therefore necessary.
Given the wealth of marker and gene information for BTA19
(the bovine-human comparative map (Band et al., 2000), the
bovine linkage map (Kappes et al., 1997), EST mapping
information (Stone et al., 2002) and the whole genome radiation
hybrid map (Williams et al., 2002)) and the fact that several QTL
of interest have been localized to this chromosome (Kneeland et
al., 2003; Li et al., 2003), the opportunity to use available BAC
libraries in combination with fingerprinting and mapping data
to examine and perhaps refine gene order on this chromosome
and to study chromosomal rearrangements and breakpoints
with respect to human was ideal.

The overgo hybridization procedure provides a very effective
and efficient method of screening large BAC libraries like
the ones used in this study. The procedure used here was first
developed and described by Cai et al. (1998), who used it to
screen a mouse BAC library. Not only did that work develop
software that can extract unique sequences for oligonucleotide
probes from publicly available sequence data, but it also
provided a multiplex oligonucleotide hybridization strategy that
reduces the amount of time required for large-scale screening of
high density BAC filters. Using the approach of Ross et al.
(1999), larger than normal probes, termed overgos, are produced
that are characterized by higher specificity labeling and better
hybridization kinetics than conventional probes. Arraying the
probes in a row and column scheme and using the pooling
strategy that is central to the procedure allowed us to rapidly
and reliably identify a large number of clones from many
different loci.

Of the 71 overgo probes used to screen the CHORI-240
library, 59 resulted in positive identification of one or more
BAC clones. In addition, 11 loci (eight unique) were mapped on
the Texas A&M BAC library and a further nine loci (five unique)
were mapped on the RPCI-42 library, bringing the total number
of loci mapped on the BAC scaffold to 72 (Table 1, Fig. 1).
Clone assignments may be viewed at
http://www.afns.ualberta.ca/Hosted/Bovine_Genomics/Index.asp.

Three BAC contigs were identified on the basis of multiple
loci mapping to individual BAC clones (Table 2).

Mapping results from the linkage and WGRH maps are
displayed in Fig. 1. A total of 43 loci were successfully mapped
on the 5000-rad WGRH map, 49 loci on the linkage map and 28
loci on both maps. The order of loci was determined based on
the WGRH map and by minimizing the putative breakpoints
between human and bovine. On a finer scale, order within BAC
contigs could be deduced from the location of BAC clones
positive for a particular gene within the contig. The order
displayed in Fig. 1 resulted in five conflicts (crossovers) with
the WGRH map but multiple conflicts in order with the linkage
map (Fig. 1).
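
The criterion of minimizing putative breakpoints can be illustrated with a simple count. The sketch below rests on our own assumptions: loci are compared by their rank along each human chromosome, and a breakpoint is scored between two adjacent bovine loci whenever they lie on different human chromosomes or are not neighbours in the human order. The locus names and positions are hypothetical, and the rule actually applied to build Fig. 1 is not described at this level of detail.

```python
def count_breakpoints(bovine_order, human_pos):
    """Count breakpoints implied by a bovine locus order.

    bovine_order: list of locus names in their proposed bovine order.
    human_pos: dict mapping locus name -> (human chromosome, position).
    """
    # Rank the loci along each human chromosome.
    by_chrom = {}
    for name in bovine_order:
        chrom, pos = human_pos[name]
        by_chrom.setdefault(chrom, []).append((pos, name))
    rank = {}
    for chrom, items in by_chrom.items():
        for i, (_, name) in enumerate(sorted(items)):
            rank[name] = (chrom, i)
    # Adjacent bovine loci must be human neighbours, in either orientation.
    breaks = 0
    for a, b in zip(bovine_order, bovine_order[1:]):
        (chrom_a, rank_a), (chrom_b, rank_b) = rank[a], rank[b]
        if chrom_a != chrom_b or abs(rank_a - rank_b) != 1:
            breaks += 1
    return breaks


# Hypothetical loci: five on HSA17 and one on HSA10 (positions in Mb)
human_pos = {
    "A": ("17", 10.0), "B": ("17", 20.0), "C": ("17", 30.0),
    "D": ("17", 40.0), "E": ("17", 50.0), "X": ("10", 5.0),
}
candidates = [list("ABDCE"), list("ABCDE"), ["A", "B", "X", "C", "D", "E"]]
best = min(candidates, key=lambda order: count_breakpoints(order, human_pos))
print(best, count_breakpoints(best, human_pos))
```

Among candidate orders that are equally compatible with the radiation hybrid data, the one with the lowest count would be retained under this criterion.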

A total of 14 breakpoints between bovine and human were
determined based on the human draft sequence and the
combination of the bovine WGRH and BAC scaffold maps (Fig. 1).
This is double the number identified by Band et al. (2000), who
based their comparative maps on bovine and human WGRH
maps. All the breakpoints were due to rearrangements between
BTA19 and HSA17 except in the case of locus MGC2491, which
maps to HSA10. This is in agreement with the linkage mapping
results of Stone et al. (2002); however, both in the present study
and in the study of Stone et al. (2002), the assignments are based
on relatively short sequences. The clones positive for the
MGC2491 locus should therefore be sequenced further to
confirm the assignment. Indeed, MGC2491 falls into a contig of
five loci that defines an internal breakpoint including the
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Fig. 1. Comparison of BTA19 gene/marker order generated through overgo hybridization with that from the bovine linkage map, 
whole genome radiation hybrid map and human draft sequence. Position on the human draft sequence is indicated on the right hand 
side, and direction is indicated by the arrows. 
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Table 1 . Oligonucleotide and primer sequences used for Overgo probe construction and results verification 


Symbol 

Alias 

ILSTS73 

ILSTS73 

ACACA 

SMJW6 

BM6000 

BM6000 

FLJ20739 

SMJW31 

BMC5012 

BMC5012 

RPL23A 

RPL23A 

SUPT6H 

RSJW14 

KIAA0732 

G73119 

LOC51003 

SMJW40 

BMS1920 

BMS1920 

KIAA0909 

G73133 

SLC25A11 

SLC25A11 

PSMB6 

G67669 

MGC2491 

G73121 

ALOX12 

ALOX12 

HSPC002 

RSJW228 

STRA12 

STRA12 

SLCA4 

GLUT4 

PLSCR3 

G73148 

FXR2 

G73131 

MUM2 

G73126 

RM388 

RM388 

IDVGA46 

IDVGA46 

BMS2503 

BMS2503 

BMS2389 

BMS2389 

FLJ20308 

RSJW256 

B9 

SMJW27 

G73139 

SMJW35 

COL1A1 

FLJ20260 

BMS650 

BMS650 

OarFCB193 

OarFCB193 

G73147 

SMJW36 

FLJ13660 

SMJW28 

G73129 

SMJW34 

FLJ20260 

SMJW30 

(G73124) 


MLN64 

SMJW42 

TCAP 

SMJW46 

THRA 

THRA 

RARA 

RARA 

IGFBP4 

SMJW7 

ETH12 

ETH12 

KRT10 

KRT10 

ATP6N1A/ 

SMJW26 

ATP6V0A1 


RM186 

RM186 

ITGA2B 

RSJW194 

ADAM11

SMJW25 


Oligonucleotide A (5'-3')


ATA TGG TTT TGA TTG TGA CCT AGC 
TGG TGT CTG CTT ACC CAG CAA GAA 

CCT ATA CAC TGA CAG GTA TAG ACT C 
GAC TCT CTC ACC CGG CTC TCA TC 
CCC TGG GAG ACC ACA GGC AGA GC 
CGG AAG AGC GCG CCT AGG AGA AAC 
GCA GCA GCA GCC AAA GTG AGT AC 

GGG CTC AGC TCC AAT TCT TCG TGT T 

GAG GGT GAC AAC ACC ATC CCG GC 

CCA TGG GCT GTC TAC CGA AAT TC 
TTC CGT GAG CGT GTG GAT TAG AG 

CTG GGG GAG GCT CAG ACT TAA ACT 

AGA TTT GAT GAT GGT TTG GTA CT 

AAA GAA GAG ATG ATG TCT TCT GCT 
ATG GGG ACG ACG TTG TAA AAG GGG 
TGG AGG TCT CGA CTA CTT CGT GA 

GAG GGG CGT AGT CTC CTG AAG GCG 
GCC TTT GTT AAC AGT AGT AAA GCC TCA 
GTA 

CTA GCT CTA TCA TCT GTT TCC CCT 
CCA TCT TTC TTT TTG GCC CAC ATT CA 

GGA CTT GTG CAT TTG GAG ATG CGA GG 

GTT GGG GAG CAT TAG CCG GGG GA 
GTC ATG TCC AAC TCT TTG AGA CC 
AGG NGG TGC CAT AAG TCT GTC AAG CAA 
GTG TGT GCT AGT GGA GAT CCT ATT 
TCC ACT TTG CTG CAG TTG GTG GT 

TGG AGA GTT GAG GCT CTG TCT CA 
ATC TGG GTC TCT CTG CAA AAC AGC 

GGG GCC ACA GGG ATG CCA GAT GGT 

TAG CAT TTC CCT GGC CAC AGT CCT 
GCT TGG AAA TAA CCC TCC TGC AT 
GGC CCA TCT TCT AGC TCA AGA AC 

GTC TAA GCC CAC CTC GTG GAT CCT 
TAA CAG ATA AAC CGA TCA CAC AAA T 

AAG CGA ACC CTG CCT TCA CTC CTG 

TTT CAG AGA GCC CCA AGT GGG AG 

CAT GAG GAG GAT ACC CAG CGG CA 

GGA GAG CAG TAC CAG GTC ACC AGA T 
GGC GGC TGC AAG AAA GTG CTT CGA 
GGG TAG GGA AAT ATA GGG GAG GAA A 

CTC CCA GCC CTA GCC CCT AGG AAG 
GAA ACA CTA CTA TTA AGC CGC ATT A 
CAT GGT CCA AAG CAC CTC GGA CA 

CCC ACA AGG GAG GGG ACC AAA AC 
TCC GGA TTC ACC TCT TGG AGG ATG 
GCT CTG GCT GAA CAC CCC TCC AA 


Oligonucleotide B (5'-3') 


GGC CCC ATA GTG TCC TGC TAG GTC 
GTC AAA GAT GGA TCC AAT TCT TGC T 

GTC CCA ATG TCA CAT ACT GAG TCT AT 
AAA GTT GTC CCC AGA GAT GAG AG 
TGA AGG CTA CTT GTT GCT CTG CC 
GCA TAG TGG TCA AGT TTG TTT CTC C 
CTG TCT ACT GTT GGT GTA CTC AC 

GAT GGA AAA GAA GCA ACA CGA A 

ACA AAA CAA CAC AGA GCC GGG AT 

TTT CCA AGT AGG TGG GAA TTT CG 
CAG TCG ATG CAA ATC TCT CTA ATC 

TGA TGG CCG GTG AGT TTA AG 

CTG AGC CAC CGG GGA AGT ACC AA 

CAT CTT GGG TGT CCC CAG CAG AAG 

CAG CTC AGG GTC CCC TTT T 

CTG AGG TTT TCG CAT TCA CGA AG 

TTC TTG ACA AGC GCC TTC A 
GCA GCG TCA ACT TAC TGA GG 

GGG AGG TCT CAT GAG GGG AAA 
ACA GCT GGG CTC ATG AAT GTG 

ACT TGG GTG ACC TCG CAT 

GAG TGT ACG TGA TGG TCC CCC GG 
AGG AGG GGC CTG GTG GGT CTC AA 
GAC ACA CAC AGA TTG CTT GA 
CAC GAA TTA GTA CAG GCA ATA GGA T 
GGC TTA GCT GGG AAC ACC ACC AA 

TGT CAT GGG GCT TGA TGA GAC AG 
GGC TAG TAG GGA TGC TGT TTT 

GCC AAA GAA GCT TTA CCA TCT G 

TGC TGT CTG CAG GGA GGA CTG T 
GCC AGA CAC TCT GGG ATG CAG GA 
ACC GGG AGA GGT GGT GTT CTT GA 

TGT CCC CCT AGA GAG GAT CCA 
GCT GCC GGC TCT CGT TTA TTT GTG T 

CCT CCT TTC TCA GCA GGA GTG 

GGG TCT TGT CTT CCC CTC CCA CT 

CGG TGA TAG GTC TCA TGC CGC TG 

TTT TTC GCT TTC CAT CTG GTG 
ACA TGC CCA TTT CGA AGC A 
TCC TTC CCT GGA TTT TCC TCC 

GTG TGT TTC TGT GAC TTC CTA G 
GGG GAG ACT CTT TCC TCT TAA TGC GG 
CCC CAT CCC CAG AGT TGT CCG AG 

GTC TTT TCT TGG TCT GTT TTG GT 
TGC TGT CAA GGT CCA TCC TCC 
GGA GCT GCA TAT CTT TTG GAG GG 


Forward PCR primer (5'-3') 
Reverse PCR primer (5'-3') 


GTC AAA GAT GGA TCC AAT TCT TGC T 
ATT TGA GAG CCC CTC TCC CC 


GGG AGG AGC CAG AAG GAT ACT 
GAG AAC CAG GGT GTG TGA TAA AG 
CGT GTT GCT TCT TTT CCA TCT G 
TTA GCG AAC AAG CTC AGG TGA 
TAT CGG CCC TTT GTA TCC TTG 
AGA ATC CAC AGG TGC TCC AAT C 

AAC TCC CCT TTT CTC TCC ACA A 
TCT CTA ATC CAC ACG CTC ACG 
TGA TGG CCG GTG AGT TTA AG 
CCA GAA TGC TGT GAA CAG GTG AA 
TGA TGA TGG TTT GGT ACT TCC C 
GTG GAT TGC CAT TTC CTT CTT 


TGG AGG TCT CGA CTA CTT CGT GA 
GAT TTC TTG ACA AGC GCC TTC 

GCA GCG TCA ACT TAC TGA GG 
AGT TCC AGG CCA CAG AAT CCT 

ACA GCT GGG CTC ATG AAT GTG 
GTA GGC TGT AGG GAT TAC TGT CT 
ACT TGG GTG ACC TCG CAT 
CTG TTG TAA AGG AGA CGG GG 


CCT TGC TCA GGA CAC GCT AAG 
GTA AAC AGC TTC CTC AGA ACC CT 

GTC ACT GGT GAT TGT GGA TCC AGA 
GGC TAG TAG GGA TGC TGT TTT 
GCC AAA GAA GCT TTA CCA TCT G 
CCA AAT CCG ATG TTT CTG CTC 


AGG TTC CCG TGT TGA CCT 
GGA AAC AGC AGA GGT TAC CC 

CTG CTG GGA ATG GAC TTA CTC AG 

CCT CCT TTC TCA GCA GGA GTG 
GAC TTA AAA GCG CTC TAC ACA 
TTT GAC ATC TTT GTG AGT GGC CT 
GCT TGA ATA CAG AGG GCT CCC 
AGG ATC TGA CGC TGT CCA C 
GGC TTT GTG CTC TGG TCT CA 


AAA AGA CAA GAC ATC ATG GCC AAC 
TCC TTC CCT GGA TTT TCC TCC 


CCG ATG TGG ATC ACC ATG GTC 
ACT GCA GTG GGG AGA TGG AG 


TGT CTG GGT CTC TGG CTC AGT 
GAC TCA CCA TGG CCC CAA AG 
Table 1 (continued) 


Symbol 

Alias 

U5-116KD 

U5-116KD 

GFAP 

GFAP 

BMS501 

BMS501 

MTMR4 

MTMR4 

GH1 

GH1 

FASN 

FASN 

IDVGA48 

IDVGA48 

IDVGA44 

IDVGA44 

ETH3 

ETH3 

RM388 

RM388 

KIAA0585 

SMJW37 

BMS601 

BMS601 

BMC 1013 

BMC 1013 


Oligonucleotide A (5'-3') 


AAG TCT TCT TTT TGT TCT GCC AGC A 
TAT GAC TCC ACT CCC TCT GCC ATC 
AGC GTC CTT GCC AAC ATC AGG ACT G 
GGG ACT GTC CCA TTA ATG AAC ATA 
TCA GCC GTA TTT TAT CCA AGT AGG 
GGA GGA GTT CTG GGC CAA TCT CAT 
CTC CAT GTC ACC TGG TCC AGG TGC 
CTA CTC AGG GAT CCA ACC AGA GTC TC 
GGG AAG CCC GCC TAC TTG GCC AC 
GTT GGG GAG CAT TAG CCG GGG GA 
AGG CAG CAA GTT CTT CGG TTT CT 

AGG TTC ACT AGG ACG ATG CTC TCA 
GTA GTT CAG CTT CCA CGG AGC TAA A 


Oligonucleotide B (5'-3') 


CCA CGT CTT CCC ATG TAT GCT GGC A 
CAT TTC AAT GGA AAG AGA TGG CAG 
GCC TAA TCA GGT ACA GTC CTG 
CTC CAA GGC CAT GCA GTA TGT TCA 
CTC CCC TAA CCA CAT CCC TAC TTG 
TCC ACA CCG CCA ATG AGA TT 
GTC AAC CTT GGT CTG CAC CTG G 
CTT CCA ATG CAG GAG ACT CT 
GAT TTT ACT CTG CCT GTG GCC AA 
GAG TGT ACG TGA TGG TCC CCC GG 
ATG CAC CCT CTT CCT AGA AAC CG 

CAG NAT AGA AGG AAC CTG AGA GCA 
CCT CTT TCC ATA TGT TAG TTT AGC TC 


Forward PCR primer (5'-3') 
Reverse PCR primer (5'-3') 


GGC TCT ACA CAC ATT CGC C 
GGA GGA GTG CTA ACC ACA CC 


Table 2. BAC contigs based on loci assignments to individual BAC clones 

Clone number    Loci 

CONTIG 1 
EO77G03         PSMB6, SLC25A11, KIAA0909 
EO77O05         PSMB6, SLC25A11, KIAA0909 
EO174002        PSMB6, MGC2491 
EO202L22        PSMB6, SLC25A11 
EO264J20        PSMB6, SLC25A11 
EO8N04          SLC25A11, KIAA0909 
EO29I05         SLC25A11, KIAA0909 
EO72N10         SLC25A11, KIAA0909 
EO103G01        SLC25A11, KIAA0909 
EO138G01        SLC25A11, KIAA0909 
EO60P24         MGC2491, ALOX12 
E066J24         MGC2491, ALOX12 
EO87H20         MGC2491, ALOX12 
E0111M17        MGC2491, ALOX12 

CONTIG 2 
E0223H14        HSPC002, STRA12 
EO98G08         HSPC002, STRA12, GLUT4, PLSCR3 
E0169E16        HSPC002, STRA12, GLUT4, PLSCR3 
E0278H24        HSPC002, STRA12, GLUT4 


CONTIG 3 
E0146F17        (G73124)/OSBPL7, G73129, G73147 
EO176N10        (G73124)/OSBPL7, FLJ13660, G73129, G73147 


HSA10 region and clones EO174002, EO60P24, E066J24, 
EO87H20 and E0111M17 that contain both HSA17 and 
HSA10 sequences. Sequencing these clones may therefore very 
precisely define a breakpoint between HSA10 and HSA17 and 
BTA19. 

Band et al. (2000) found that a small region of HSA5 was present 
on BTA19. In this study we failed to detect the locus in question 
(CCNG1) in the CHORI-240 library. Thus there remains 
the possibility that human chromosomal regions other than 
HSA10 and HSA17 are present on BTA19. 

In conclusion, the results of this study confirm the positions 
of many of the markers and genes that have been previously 
mapped to the bovine genome and assign them to particular 
regions of the BAC scaffold map. In addition, several 
areas of possible uncertainty are highlighted and these can be 
explored in more detail. Finally, evolutionary breakpoints between 
the bovine and human genomes are highlighted. Not only will 
these results aid in the final construction of the BAC map, but 
they also represent an important step towards the ultimate goal 
of sequencing the bovine genome. 

Acknowledgements 

The authors are grateful to Chung Li Shu and Barbara Swiatkiewcz for 
their involvement in constructing the RPCI-42 and CHORI-240 libraries. 
motor neuron gene (SMN) in domestic bovids 

L. Iannuzzi, a G.P. Di Meo, a A. Perucatti, a R. Rullo, a D. Incarnato, a M. Longeri, b 
G. Bongioni, d L. Molteni, c A. Galli, d M. Zanotti b and A. Eggen e 

a National Research Council (CNR), ISPAAM, Laboratory of Animal Cytogenetics and Gene Mapping, Naples; 
b Institute of Zootechnics, Faculty of Veterinary Medicine, Milan; 
c Institute of Animal Production, Agricultural Faculty of Science, Milan; 
d Experimental Institute Lazzaro Spallanzani, Milan (Italy); 

e INRA, Department of Animal Genetics, Laboratory of Biochemical Genetics and Cytogenetics, Jouy-en-Josas (France) 


Abstract. A comparative fluorescence in situ mapping of 
the SMN gene was performed on R-banded chromosome preparations 
of cattle (Bos taurus, BTA, 2n = 60), river buffalo (Bubalus 
bubalis, BBU, 2n = 50), sheep (Ovis aries, OAR, 2n = 54) 
and goat (Capra hircus, CHI, 2n = 60), as well as on those of a 
calf of the Piedmont breed affected by arthrogryposis. SMN 
was located on BTA20q13.1, OAR16q13.1, CHI20q13.1 and 
BBU19q13. These chromosomes and chromosome bands are 
believed to be homeologous, confirming the high degree of 
chromosome homeology among bovids. The position of SMN 
was refined in cattle, compared to the two previous localizations, 
while it is a new gene assignment in the other three 
bovids. A comparative fiber-FISH performed on extended 
chromatin of both normal cattle and the calf affected by 
arthrogryposis revealed more extended FITC signals in the calf 
than in the normal cattle (control), suggesting a possible 
duplication of the SMN gene in the calf affected by arthrogryposis. 

Copyright©2003 S. Karger AG, Basel 


Arthrogryposis is a complex of symptoms characterized by 
congenital limb contractures of differing origin. Recently, 
congenital arthrogryposis has been described as an autosomal 
recessive condition in Suffolk lambs (Doherty et al., 
2000) and in different cattle breeds (Goonewardens and Berg, 
1976; Nawrot et al., 1980; Pumarola et al., 1997; Pozzatti, 
2002). In the Piedmont breed, a form of arthrogryposis has been 
recorded with a frequency of 1.8% (Longeri et al., 2003). 
In humans, evidence has been reported that a heritable form of 
limb contracture with muscular atrophy (SMA) is caused, in 
many cases, by disruption of the telomeric copy of a duplicated 
gene called SMN1, which maps to HSA5q13 (Lefebvre et al., 
1995). 

SMN has been FISH-mapped to cattle chromosome 20q12-13 
(Pietrowski et al., 1998) and 20q14 (Eggen et al., 1998a). In 
this study, we comparatively FISH-map SMN in cattle, river 
buffalo, sheep and goat, refining its position in cattle and 
assigning, for the first time, this important gene to sheep, goat 
and river buffalo chromosomes. Furthermore, a comparative 
fiber-FISH on extended chromatin of both normal cattle and a 
Piedmont calf affected by arthrogryposis revealed a possible 
duplication of the SMN gene in the latter. 


Materials and methods 

Peripheral blood cell cultures from normal cattle (BTA), a calf of the 
Piedmont breed affected by arthrogryposis, river buffalo (BBU), sheep 
(OAR) and goat (CHI) were treated for late incorporation of both BrdU 
(15 µg/ml) and Hoechst 33258 (30 µg/ml) to obtain enhanced R-banded 
preparations. As a probe, a mixture of two BAC clones (20 ng/µl) containing 
the SMN gene (Eggen et al., 2001) was employed. After a detection step with 
FITC-avidin and anti-avidin antibody, slides were stained with propidium 
iodide (5 µg/ml in distilled water), mounted in antifade and observed under a 
DM Leica fluorescence microscope (filter combination 13). A fiber-FISH 
with the same clones (30 ng/µl) was also performed on extended chromatin 
from fixed cells of both normal cattle and the calf affected by arthrogryposis, 
following the protocol of Fidlerova et al. (1994). FITC-signal detection on 
extended chromatin was done as reported above, except that slides were 
mounted in antifade/Hoechst 33258. Relative lengths of fiber-FISH 
FITC signals were calculated on the basis of cattle chromosome 1, 
which is considered to be about 10 µm long at the metaphase stage. 
Metaphase plates showing FITC signals (green) and RBPI-banding (red) and 
chromatin fibers showing FITC signals were captured using a color camera 
(Photometrics, CoolSNAP cf). Chromosome identification and the band 
numbering system followed the ISCNDB (2000) for cattle, sheep and goat and the 
CSKBB (1994) for river buffalo. 


Fig. 1. (a-d) Details of comparative FISH-mapping with BAC clones 
containing the SMN gene in cattle (a), river buffalo (b), sheep (c) and 
goat (d). The simultaneous visualization of FITC signals (green) and 
RBPI-banding (red) revealed the presence of the SMN gene on BTA20q13.1, 
BBU19q13, OAR16q13.1 and CHI20q13.1 (arrows). All details have the 
same magnification. (e-l) Details of the fiber-FISH performed on 
released chromatin of normal cattle cells (e, g, i) and of cells from 
the calf affected by arthrogryposis (f, h, l). Arrows indicate the sites 
of the FITC signals on single chromatin fibers (e, f, g, h, l: white 
arrows) and on two chromatin fibers (i: white and red arrows). Note the 
longer extension (at least double) of FITC signals on chromatin fibers of 
the calf affected by arthrogryposis (f, h, l). All details have the same 
magnification. 


Results and discussion 

Thirty metaphases for each species and animal (cattle) were 
studied. The simultaneous visualization of FITC signals and 
RBPI-banding allowed a precise localization of the hybridization 
signals of the SMN gene on BTA20q13.1, OAR16q13.1, 
CHI20q13.1 and BBU19q13 (Fig. 1a-d). These chromosomes 
and chromosome bands are believed to be homeologous, 
confirming the high degree of chromosome homeology among 
bovids. The frequency of FITC signals (chromosomes with single 
or double spots on one or both chromatids) varied from 22% 
in sheep to 90% in cattle and river buffalo. The position of 
SMN was refined in cattle, compared to localizations achieved 
earlier on 20q12-13 (Pietrowski et al., 1998) and 20q14 (Eggen 
et al., 1998a), and assigned, for the first time, in sheep, goat and 
river buffalo chromosomes. 

Since variability in RACE amplification of the SMN 3'-end 
has previously been recorded, suggesting a gene 
duplication (Longeri et al., 2003), and the expression of SMA in 
humans seems to be correlated with multiple copies of the SMN2 
gene (Vitali et al., 1999), we checked whether duplications of the 
SMN gene occurred in the Piedmont calf affected by arthrogryposis 
by comparing the extension of FITC signals on released 
chromatin, using fiber-FISH on both normal cattle cells 
(control) and cells of the calf affected by arthrogryposis (Fig. 1e-l). 
In the 20 preparations (fibers with FITC signals) examined 
in both normal cattle (control) and the affected calf, a pronounced 
extension of FITC signals (mean value = 7.6 ± 2.2 µm) was 
observed in the latter (Fig. 1f, h, l), compared to that (mean 
value = 3.1 ± 0.6 µm) obtained in cells of normal cattle 
(Fig. 1e, g, i). This strongly suggests a possible duplication of 
the SMN gene in the Piedmont calf affected by arthrogryposis. 
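
The comparison of signal lengths can be checked from the reported summary values alone. The sketch below is not part of the original analysis; it assumes that the "±" values are standard deviations and that 20 fibers were measured per group (the text leaves both points open), and it uses Welch's t statistic, which is our choice rather than a test named by the authors.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from summary statistics (mean, SD, sample size)."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Values from the text; SD interpretation and n = 20 per group are assumptions.
affected = (7.6, 2.2, 20)  # mean (µm), SD (µm), number of fibers
control = (3.1, 0.6, 20)

ratio = affected[0] / control[0]
t_stat, dof = welch_t(*affected, *control)

print(f"length ratio affected/control: {ratio:.2f}")
print(f"Welch t = {t_stat:.2f}, df = {dof:.1f}")
```

Under these assumptions the mean signal in the affected calf is more than twice as long as in the control, which is consistent with the "at least double" extension noted in the legend of Fig. 1.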

In human patients affected by SMA, multiple copies of 
SMN2 genes, as a possible gene dose compensation for SMN1 
deletion, were correlated with varying severity of the disease 
(Vitali et al., 1999). Eggen et al. (1998b), using a microsatellite 
contained in a cosmid with MAP1B (close to SMN), did not 
find close linkage with SMA in cattle, although they examined 
only a small Brown Swiss pedigree. So far, arthrogryposis in 
Piedmont cattle seems to be a different form from that in Brown 
cattle. Thus, more cases of arthrogryposis should be investigated 
in Piedmont cattle to reach more precise conclusions on the 
genetic structure of this disease in this economically important 
species. 
Abstract. We report the cloning and initial characterization 
of the genes encoding DGAT2 (diacylglycerol transferase 2), 
MOGAT1 and MOGAT2 (monoacylglycerol transferases 1 
and 2) in domestic cattle (Bos taurus). The three closely related 
genes belong to a gene family with at least eight members in 
mammals and are candidate genes for quantitative traits related 
to dietary fat uptake, lipid synthesis and storage. MOGAT2 
and DGAT2 form a tandem and were mapped to bovine chromosome 
(BTA) 15q25→q26 by fluorescence in situ hybridization. 
MOGAT1 was localized to BTA 2q43→q44. The three 
genes were investigated for polymorphisms that might be associated 
with breeding values for milk fat percentage in the dairy 
breeds German Holstein, German Simmental and German 
Brown. All the detected polymorphisms were located outside 
exons or, with one exception, were silent. In MOGAT1, a missense 
mutation in exon 4 was found that causes a non-conservative 
substitution of cysteine 170 (uncharged, hydrophobic) by 
lysine (positively charged, hydrophilic). However, allele frequency 
estimates from pooled DNA samples revealed no significant 
association of the observed polymorphisms with breeding 
values for milk fat percentage. A comparative analysis of 
chromosomal locations and exon-intron structure of the known 
members of the DGAT2/MOGAT gene family in humans, 
rodents and cattle indicates an ancient tandem duplication of 
the ancestor gene combined with an intron gain (or loss) in one 
copy. Further members of the family may have arisen by duplications 
of this gene tandem via two rounds of interchromosomal 
or genome duplications as well as further local (single) gene 
duplication and loss events. 

Copyright©2003 S. Karger AG, Basel 


Triglycerides (triacylglycerols) are the major energy storage 
molecules in eukaryotes. The final, and presumably rate-limiting, 
step of triglyceride synthesis is catalyzed by a diacylglycerol 
acyltransferase (DGAT) (Mayorek et al., 1989). DGAT1 was 
the first identified gene encoding a protein with DGAT activity 
(Cases et al., 1998). A missense mutation (Lys232→Ala) in 
DGAT1 has been shown to be significantly associated with 
variation in milk fat percentage in cattle. DGAT1 is likely causal 
for a QTL near the centromere of bovine chromosome 14 
(Grisart et al., 2002; Winter et al., 2002). Generation of viable 
DGAT1-knockout mice (Smith et al., 2000) revealed that 
DGAT-like activity is found in other enzymes encoded by other 
genes and led to the detection of DGAT2. In humans, 
DGAT2 is expressed in many tissues. Highest mRNA levels 
were found in the liver, white adipose tissue and the mammary 
gland (Cases et al., 2001). 

DGAT2 was the first identified member of a gene family 
with at least eight members in mammals (Cases et al., 2001). To 
date, this family has not been fully characterized in any single 
mammalian species. As such, the nomenclature for the family 
has not been finalized. This is especially the case with those 
members encoding monoacylglycerol acyltransferase activity. 
The latter are referred to as “MGATs” in the literature but have 
been designated provisionally as “MOGATs” by the Nomenclature 
Committee of the Human Genome Organization 
(HGNC; Wain et al., 2002) because the MGAT symbol is 
reserved for another gene family. We follow the usage of the 
HGNC herein. 
DGAT2 and its relatives encode intrinsic membrane proteins 
that are completely unrelated to DGAT1. Hydrophobic 
analysis of the respective amino acid sequences reveals nine 
putative transmembrane domains in human DGAT1, but only 
two in human DGAT2 (Oelkers et al., 1998; Cases et al., 2001). 
Recently, three members of this gene family have been characterized 
in mice: MOGAT1 (Yen et al., 2002), MOGAT2 (Cao 
et al., 2003; Yen and Farese, 2003) and MOGAT3 (Cheng et 
al., 2003). MOGAT enzymes catalyze the synthesis of triglycerides 
from 2-monoacylglycerols and acyl-CoA. The so-called 
monoacylglycerol pathway is essential for intestinal dietary fat 
resorption. Here we report the cloning, physical mapping and 
sequence analysis of DGAT2, MOGAT1 and MOGAT2 in cattle 
(Bos taurus). Additionally, we screened for intragenic 
polymorphisms and performed an initial association study of the 
three genes with milk fat percentage in the three dairy breeds 
German Holstein, German Simmental and German Brown. 

Materials and methods 

Isolation of genomic clones 

Human nucleotide sequences (GenBank accession nos. DGAT2, 
BC015234; MOGAT1, AF384163; MOGAT2, AK026297; DGAT2L3, 
XM_088691; DGAT2L4, XM_088683) and BLAST algorithms (Altschul et 
al., 1990) with translated nucleotide sequences (TBLASTX) were used to 
search the expressed sequence tags database (dbEST) of GenBank (Boguski 
et al., 1993). EST sequences were found for bovine DGAT2 (GenBank 
accession nos. BE724193, BI536057, AW326247, BI681948, BE482224, 
BE479873, BF868335, BG694175, BG687855 and BF430191) and bovine 
MOGAT1 (GenBank accession nos. AW429404 and BE754760). The ten 
EST sequences for DGAT2 represent the complete mRNA sequence from 
exons 1-8. The consensus sequence of the two ESTs for MOGAT1 covers 
exons 1-3 and 6. Bovine ESTs were assembled into consensus mRNA 
sequences. Human and mouse sequence data (sources: NCBI, 
http://www.ncbi.nlm.nih.gov/ and Ensembl, http://www.ensembl.org/) were 
used to obtain putative splice sites. PCR primers were designed using 
Primer3 software (Rozen and Skaletsky, 1998), taking into account splice 
sites and intron sizes in humans. PCR primers for MOGAT1 exons 4 and 5 
were derived from the human cDNA sequence (GenBank accession no. 
AF384163). Initial PCR primers for MOGAT2 were derived from a porcine 
MOGAT2 EST sequence (GenBank accession no. BE030672) spanning exons 
1-4. The following primers and amplification products were used to screen 
the gridded bovine BAC library RPCI-42 (Warren et al., 2000): 

DGAT2: 807 bp fragment (exon 5 - exon 6) 
forward 5'-CAGGAACTACATCTTTGGGTACCA-3' 
reverse 5'-ATTGCCACTCCCATTCTTTG-3' 

MOGAT1: 347 bp fragment (intron 5 - exon 6) 
forward 5'-ACAATCCAGCATGTGCAGAG-3' 
reverse 5'-CTGGAATACCATACTTCCCTTTG-3' 

MOGAT2: 422 bp fragment (exon 3 - exon 4) 
forward 5'-CCCCCATCTGATGATGCT-3' 
reverse 5'-TGCTCAGGATGTGAGCAGC-3' 
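
As a quick sanity check on the screening primers listed above, the following sketch reports length, GC fraction and a rough melting temperature for each primer. It is an illustration only: the dictionary keys are our own labels, the sequences are those listed above with extraction spacing removed, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is a coarse rule of thumb for short oligonucleotides, not the Primer3 calculation used by the authors.

```python
# Primer sequences from the text; the key names are hypothetical labels.
PRIMERS = {
    "DGAT2_forward": "CAGGAACTACATCTTTGGGTACCA",
    "DGAT2_reverse": "ATTGCCACTCCCATTCTTTG",
    "MOGAT1_forward": "ACAATCCAGCATGTGCAGAG",
    "MOGAT1_reverse": "CTGGAATACCATACTTCCCTTTG",
    "MOGAT2_forward": "CCCCCATCTGATGATGCT",
    "MOGAT2_reverse": "TGCTCAGGATGTGAGCAGC",
}

def gc_fraction(seq):
    """Fraction of G and C bases in an unambiguous DNA sequence."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rough melting temperature in degrees Celsius by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, "
          f"Wallace Tm {wallace_tm(seq)} C")
```

The Wallace estimate tends to overshoot for primers of this length, so the values are best read as a relative comparison between the forward and reverse primer of each pair.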

BAC DNA was prepared using QIAGEN Plasmid Midi Kit (Qiagen, 
Hilden, Germany). The specificity of the isolated BAC clones was confirmed 
by PCR amplification and sequencing of the respective DNA fragments. 

Fluorescence in situ hybridization (FISH) 

Purified BAC DNA from clones RPCI42-269A1 (DGAT2 and MOGAT2) 
and RPCI42-307A24 (MOGAT1) was labeled with digoxigenin-dUTP 
by standard nick translation and hybridized with 10× excess of bovine 
Cot-1 DNA to normal male bovine metaphase spreads. Probe hybridization 
was detected with monoclonal mouse anti-digoxigenin (Roche, Mannheim, 
Germany) and sheep anti-mouse-FITC antiserum (Sigma-Aldrich, 
Deisenhofen, Germany). The chromosomal gene locations were assessed 
according to the standardized karyotype of domestic cattle Bos taurus 
(ISCNDC, 2000) by measuring the relative fractional length from the long 
arm telomere to the hybridization signal (FLqter) and by comparison with 
the G-band-like DAPI staining pattern. Chromosome measurements were 
made using the software program MicroMeasure (Reeves and Tear, 2000). 
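
The fractional-length measurement described above reduces to a ratio per metaphase followed by a mean and standard deviation. The sketch below shows that arithmetic only; the measurement values are hypothetical, and dividing by the total chromosome length is an assumption made here for illustration (the text does not state the denominator, and MicroMeasure's own output format is not reproduced).

```python
import statistics

def fl_qter(signal_to_qter, chromosome_length):
    """Relative fractional length from the q-arm telomere to the signal."""
    if chromosome_length <= 0 or not 0 <= signal_to_qter <= chromosome_length:
        raise ValueError("distances must satisfy 0 <= signal <= chromosome length")
    return signal_to_qter / chromosome_length

# Hypothetical measurements in arbitrary units:
# (distance from qter to signal, chromosome length), one pair per metaphase.
measurements = [(1.1, 6.4), (0.9, 5.8), (1.2, 6.1), (1.0, 6.3), (1.1, 5.9)]

values = [fl_qter(d, length) for d, length in measurements]
mean_fl = statistics.mean(values)
sd_fl = statistics.stdev(values)  # sample standard deviation

print(f"FLqter = {mean_fl:.2f} ± {sd_fl:.2f} (n = {len(values)})")
```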

Genomic DNA sequencing and long-range PCR 

BAC clones RPCI42-5L16 (DGAT2), RPCI42-362M12 (MOGAT1) and 
RPCI42-20B12 (MOGAT2) were used for direct sequencing using an ABI 
377 automated sequencer and BigDye kit v2.0 (Applied Biosystems 
Division, Foster City, CA, USA). To assess the size of the larger introns, 
20 ng of purified BAC DNA was amplified by long-range PCR in 20 µl 
reactions containing 2 units of AmpliTaq Polymerase (Qiagen, Hilden, 
Germany), 0.1 unit of ProofStart DNA Polymerase (Qiagen, Hilden, 
Germany), 1× AmpliTaq PCR buffer, 1.5 mM MgCl2, 300 µM of each 
nucleotide, 0.5 µM each of forward and reverse primer, 4 µl of Qiagen 
Q-solution and 2% DMSO. The PCR profile included 2 min at 95 °C, 
followed by 35 cycles of 10 s at 94 °C, 1 min at 61 °C and 20 min at 68 °C. 

Polymorphism analysis 

Semen samples were obtained from bulls of the dairy breeds German 
Holstein (2 × 32 animals), German Simmental (2 × 32) and German Brown 
(2 × 20). To search for gene variants associated with milk fat percentage in 
each breed, bulls with extremely high (+) and low (-) breeding values for milk 
fat percentage were selected. Equal amounts of individual DNA samples 
were pooled as described previously to minimize effort (Winter et al., 2002). 
Screening for polymorphisms in exons and smaller introns was done by 
resequencing the six pooled DNA samples as well as 12 individual DNA samples 
(German Holstein and German Simmental only). 

Resequencing used a 20 µl PCR reaction containing 50 ng of genomic 
DNA, 0.5 units of HotStar Taq Polymerase (Qiagen, Hilden, Germany), 1× 
PCR buffer, 1.5 mM MgCl2, 200 µM of each nucleotide and 0.5 µM of 
each primer. The cycling profile was initial denaturation for 5 min at 95 °C; 
35 cycles of 1 min at 94 °C, 1 min at 60 °C and 1 min at 72 °C; and final 
elongation of 3 min at 72 °C. PCR products were purified using 
MultiScreen-PCR filtration plates (Millipore, Eschborn, Germany). The 
purified PCR products were sequenced using an ABI 377 automated sequencer 
(see above). Sequence data were analyzed using the 
Phred/Phrap/Polyphred/Consed software suite (Nickerson et al., 1997; Ewing 
and Green, 1998; Ewing et al., 1998; Gordon et al., 1998). Allele 
frequencies were estimated by analyzing sequence traces from pooled DNA. 
For each polymorphism, normalized amplitude values of the two alternative 
bases from the pooled DNA were compared with their normalized amplitude 
values from homozygous and heterozygous individuals as described 
previously (Winter et al., 2002). The analysis was automated using Python 
scripts (available from authors upon request). 
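
The authors' scripts are not reproduced in the text, so the following is only a sketch of one common way to turn pooled trace amplitudes into an allele frequency. It assumes a single correction factor k taken from a heterozygous individual, where both alleles are present in equal amounts, and applies it to the pool; the function name, the amplitude values and the use of only the heterozygote for calibration are all assumptions, not the published procedure of Winter et al. (2002).

```python
def pooled_allele_frequency(pool_a, pool_b, het_a, het_b):
    """Estimate the frequency of allele A in a DNA pool.

    pool_a, pool_b: peak amplitudes of alleles A and B in the pooled trace.
    het_a, het_b:   peak amplitudes of A and B in a heterozygous individual,
                    used to correct for unequal signal strength of the bases.
    """
    if min(pool_a, pool_b) < 0 or het_a <= 0 or het_b <= 0:
        raise ValueError("amplitudes must be non-negative (heterozygote > 0)")
    if pool_a + pool_b == 0:
        raise ValueError("pool shows no signal for either allele")
    k = het_a / het_b  # signal of A relative to B at equal template amounts
    return pool_a / (pool_a + k * pool_b)

# Hypothetical amplitudes (arbitrary units).
freq_a = pooled_allele_frequency(pool_a=300.0, pool_b=100.0,
                                 het_a=120.0, het_b=100.0)
print(f"estimated frequency of allele A in the pool: {freq_a:.3f}")
```

By construction, feeding the heterozygote's own amplitudes in as the pool returns 0.5, which is a useful check that the correction factor is applied in the right direction.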


Results and discussion 

A BAC clone containing both DGAT2 and MOGAT2 was 
mapped to bovine chromosome (BTA) 15q25→q26 (Fig. 1A). 
This agrees with the observation that DGAT2 and MOGAT2 
also form a tandem some 40 kb apart in both humans and mice. 
MOGAT1 was assigned to BTA 2q43→q44 (Fig. 1B). FISH 
results and comparative mapping data are summarized in 
Table 1. In humans, mice and rats, eight members of the 
DGAT2/MOGAT gene family have been found, located on 
four different chromosomes. Including cattle, all known 
family members in these species are located at the known 
respective orthologous chromosome segments. Current homology 
maps between any of humans, mice, rats and cattle (MGI: 
Mouse Genome Informatics, http://www.informatics.jax.org/, 
May 2003; Fronicke and Wienberg, 2001) allowed us to predict 
the chromosomal position of as yet unidentified DGAT members 
in cattle (e.g. on chromosome X). 

Sequences of smaller introns and all exons except the last 
exon of MOGAT2 have been deposited in GenBank under the 
Fig. 1. Physical localization of DGAT2 and MOGAT2 (A) and MOGAT1 (B) in domestic cattle (Bos taurus) by fluorescence in 
situ hybridization. BAC clones containing the respective genes were hybridized to metaphase spreads from a normal bull. 
Chromosomes are counterstained with DAPI (pseudo-colored in red). Scale bar = 10 µm. 


Table 1. Chromosomal locations of DGAT2/MGAT gene family members 

Gene       n    FLqter a ± SD   Chromosomal position b 
                                Cattle        Human         Mouse         Rat 

MOGAT1     21   0.17 ± 0.03     2q43-q44      2q36.2        1C4           9q33 
                                AJ519785 c    BN000154 c    AF384162 d    XM_237315 c 
MOGAT2     22   0.27 ± 0.04     15q25-q26     11q13.5       7E1           1q32 
                                AJ519786 c    AY157608 d    AY157609 d    XM_218952 c 
DGAT2                           15q25-q26     11q13.5       7E1           1q32 
                                AJ519787 c    BC015234 d    AF384160 d    AJ487787 d 
MOGAT3                          25?           7q22          5G1           12q12 
                                              AY229854 d    AC079872 e    RNORO1027916 
DGAT2L7                         25?           7q22          5G1           12q12 
                                              BN000168 c    AC079872 e    XM_222084 c 
DGAT2L3                         X?            Xq12          XC3           Xq21 
                                              BN000155 c    XM_141972 c   XM_228568 c 
DGAT2L4                         X?            Xq12          XC3           Xq21 
                                              BN000156 c    XM_141969 c   XM_228583 c 
DGAT2L6                         X?            Xq12          XC3           X? 
                                              BN000157 c    XM_141971 c 

a FLqter = relative fractional length from the long arm telomere to the signal position ± standard deviation. 
b Chromosomal positions in humans, mice and rats were derived from genome draft sequences 
(http://www.ncbi.nlm.nih.gov/; http://www.ensembl.org). Question marks indicate putative gene family members 
and their chromosomal position as predicted from current homology maps (MGI: Mouse Genome Informatics, The 
Jackson Laboratory, Bar Harbor, Maine, http://www.informatics.jax.org, May 2003; Fronicke and Wienberg, 
2001). 
c Accession number refers to predicted mRNA. 
d Accession number refers to cDNA. 
e Accession number refers to genomic DNA. 


following accession numbers: DGAT2 mRNA: AJ519787; 
DGAT2 gDNA: AJ534368, AJ534369, AJ534370, AJ534371, 
AJ534372; MOGAT1 mRNA: AJ519785; MOGAT1 gDNA: 
AJ534373, AJ534374, AJ534375, AJ534376; MOGAT2 trunc: 
AJ519786; MOGAT2 gDNA: AJ534377, AJ534378, 
AJ534379. Compared to the respective human sequences, 
bovine DGAT2, MOGAT1 and MOGAT2 show 90.2%, 84.1% 
and 80.2% identity of the coding sequence, respectively. 
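
Percent identity figures such as those above are computed over an alignment of the two coding sequences. The text does not say which alignment program or gap treatment was used, so the sketch below only illustrates one simple convention: count matching positions among aligned columns in which neither sequence has a gap. The two short sequences are invented for the example and are not bovine or human data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if x != gap and y != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

# Hypothetical aligned fragments (not real DGAT2 sequence).
seq_one = "ATGAAGACC-TCATAGCC"
seq_two = "ATGAAAACCCTCATCGCC"
identity = percent_identity(seq_one, seq_two)
print(f"identity over gap-free columns: {identity:.1f}%")
```

Other conventions (for example, counting gap columns as mismatches) give lower values for the same alignment, so identity figures are only comparable when the convention is the same.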

The gene polymorphisms that were identified by an initial 
screening in the dairy breeds German Holstein, German Simmental 
and German Brown are shown in Table 2. Resequencing 
of DGAT2 revealed 22 SNPs (single nucleotide polymorphisms) 
and a six-nucleotide insertion, all in untranslated 
regions. In intron 6 we found 12 linked SNPs (ID 293-304); 
individual sequencing of 38 animals showed that all were either 
homozygous for one of two haplotypes or heterozygous for all 
12 SNP loci, indicating the existence of only two haplotypes. In 
MOGAT1, we found three intron SNPs and a missense mutation 
(SNP ID 347) in exon 4 that leads to a non-conservative 
substitution of cysteine 170 (uncharged, hydrophobic) by lysine 
Table 2. Polymorphisms in bovine DGAT2, MOGAT1 and MOGAT2 

SNP ID a   Position   Allele 1   Allele 2   Remark 

DGAT2 

intron 4 (AJ534371) b 
338        433        A          G 
intron 5 (AJ534371) 
291        755        A          G 
-          959-66     ins c      T 
292        1004       A          G 
intron 6 (AJ534371) 
293        1501       G          T 
294        1514       T          C 
295        1541       C          G 
296        1578       C          T 
297        1614       A          G 
298        1637       A          G 
299        1694       A          C 
300        1740       C          T 
301        1766       A          del 
302        1927       G          A 
303        2012       T          G 
304        2065       T          C 
intron 7 (AJ534372) 
339        349        A          G 
340        357        A          G 
341        396        A          G 
342        448        A          G 
343        481        A          G 
344        668        C          G 
3'UTR (AJ534372) 
345        975        A          G 

MOGAT1 

intron 1 (AJ534374) 
346        154        G          del 
exon 4 (AJ534374) 
347        426        G          C          Cys 170 - Ser 
intron 5 (AJ534374) 
348        1485       C          T 
intron 5 (AJ534376) 
349        502        C          T 

MOGAT2 

5' upstream (AJ534377) 
350        109        G          A 
351        156        G          C 
352        361        G          A 
353        421        A          C 
354        463        C          T 
355        467        G          C 
356        513        G          C 
5'UTR (AJ534377) 
357        578        G          A 
exon 1 (AJ534377) 
358        618        G          A          silent 
intron 1 (AJ534377) 
359        724        C          T 
360        727        C          A 
361        757        C          T 
362        760        T          C 
exon 2 (AJ534378) 
363        290        G          A          silent 
intron 4 (AJ534379) 
365        1188       A          G 
366        1212       C          A 
intron 5 (AJ534379) 
367        1725       T          C 

a SNP ID refers to the SNPZoo database, freely accessible via http://www.snpzoo.de/ (Fries and Durstewitz, 
2001). 
b GenBank database accession numbers are indicated in brackets. 
c ins = CCCTGGCA. 


(positively charged, hydrophilic). In MOGAT2, we found 15 
SNPs outside exons and two silent exon SNPs (ID 358 and 
363). A standard chi-square test did not reveal a significant 
association of allele frequencies with breeding values for milk 
fat content in any of the three analyzed dairy breeds (data not 
shown). However, the upstream regulatory regions have not yet 
been analyzed. 
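
For readers who want to reproduce this kind of comparison, a chi-square test on a 2 × 2 table of allele counts (high versus low breeding-value pool) can be written in a few lines. The sketch below is illustrative only: the allele counts are hypothetical, they assume that estimated pool frequencies were converted to counts over 2 × 32 chromosomes per pool, and no continuity correction is applied; the text does not state these details.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator

# Hypothetical allele counts: 32 bulls per pool give 64 chromosomes each.
high_pool = (35, 29)  # (allele 1, allele 2) in the high (+) pool
low_pool = (30, 34)   # (allele 1, allele 2) in the low (-) pool

chi2 = chi_square_2x2(*high_pool, *low_pool)
CRITICAL_5_PERCENT = 3.841  # chi-square critical value, 1 df, alpha = 0.05

print(f"chi-square = {chi2:.3f}")
print("significant at 5%" if chi2 > CRITICAL_5_PERCENT else "not significant at 5%")
```

With many SNPs tested per gene, a correction for multiple testing would normally be considered as well; the text does not say whether one was applied.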

The exon-intron structure of bovine DGAT2, MOGAT1 
and MOGAT2 is shown in the upper part of Fig. 2. All splice 
sites follow the GT-AG rule (Breathnach et al., 1978). The 
exon-intron borders of bovine DGAT2, MOGAT1 and MOGAT2 
are completely conserved with respect to their human 
orthologues, with the single exception that exon 1 of DGAT2 in 
cattle consists of only the first 40 bp of the corresponding exon 
in humans. Human MOGAT2 has a truncated splice variant 
that is terminated by a stop codon in intron 4 and encodes a 
protein without MOGAT activity (Yen and Farese, 2003). 
Bovine MOGAT2 also contains a stop codon in intron 4, 
suggesting that a similar splice variant may exist in cattle. Exon 6 
of bovine MOGAT2 has not yet been sequenced. 
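
The GT-AG rule mentioned above states that introns begin with the dinucleotide GT and end with AG on the coding strand. A minimal check is sketched below; the intron sequences are made up for the example and do not come from the deposited bovine sequences.

```python
def follows_gt_ag(intron):
    """Return True if an intron sequence starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

# Hypothetical intron sequences keyed by an illustrative label.
introns = {
    "example_intron_1": "GTAAGTCTTTCCCTTTCTCCAG",
    "example_intron_2": "GTGAGTACCCTTTTTTGCAG",
    "example_non_canonical": "GCAAGTCTTTCCCTTTCTCCAG",
}

for label, seq in introns.items():
    status = "GT-AG" if follows_gt_ag(seq) else "not GT-AG"
    print(f"{label}: {status}")
```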


With few exceptions, gene structure is conserved extensively 
within the DGAT2/MOGAT gene family as shown in the lower 
part of Fig. 2. An additional intron splits exon 2 of MOGAT1 
and MOGAT2 into two exons in all other family members. 
This structural difference divides the gene family into a 
DGAT2 lineage (DGAT2, MOGAT3, DGAT2L3, DGAT2L4, 
DGAT2L6) and an MOGAT lineage (MOGAT1, MOGAT2, 
DGAT2L7). DGAT2 has also acquired an additional (first) 
exon at the 5' end that has no equivalent in the other family 
members. The three X-chromosome family members 
(DGAT2L3, DGAT2L4, DGAT2L6), whose function is still 
unknown, show conservation of the last six exons compared 
with DGAT2 and MOGAT3. We are currently investigating 
chromosomal positions, gene structure changes and coding 
sequences in additional distantly related mammals to gain more 
insight into the evolution of the gene family. Preliminary 
results indicate that the family originated with an ancient tandem 
duplication of the ancestral gene combined with an intron 
gain (or loss) in one copy. This event possibly predated 
vertebrates. One gene copy gave rise to the MOGAT lineage, the 
other to the DGAT2 lineage (see above). This gene tandem was 
[Fig. 2 graphic. Panel labels (gene, Chr., bp). Bos taurus: MOGAT1 (2, 1008); MOGAT2 (15, 1005); MOGAT2 trunc (678); DGAT2 (15, 1086). Homo sapiens: MOGAT1 (2, 1008); MOGAT2 (11, 1005); MOGAT2 trunc (752); MOGAT3 (7, 1026); DGAT2 (11, 1167); DGAT2L3, DGAT2L4 and DGAT2L6 (X).] 


Fig. 2. Predicted exon-intron structure of DGAT/MOGAT gene family members in cattle (upper part) and humans (lower 
part). Exons are represented by black boxes and numbered in white. Exon and intron sizes are indicated in bp (or kb). The 
automatically predicted Ensembl exon-intron structure (http://www.ensemble.org) was manually corrected (Ensembl Accession 
numbers: MOGAT1, ENST00000264412; MOGAT3, ENSG00000106384; DGAT2L7, ENST00000323003; MOGAT2, 
ENSG00000166391; DGAT2, ENSG00000062282; DGAT2L3, ENSG00000180526; DGATL4, ENSG00000147160; DGATL5, 
ENSESTG00000023617). Human MOGAT2 has a truncated splice variant that is terminated by a stop codon in intron 4 (white 
box). Bovine MOGAT2 also contains a stop codon in intron 4 suggesting that a similar splice variant may exist in cattle. Exon 6 of 
bovine MOGAT2 has not yet been sequenced. 


subject to subsequent interchromosomal duplications or genome
duplications (tetraploidizations). In recent mammals, at
least two such mixed gene tandems are still present: MOGAT2/DGAT2
(on human Chr 11, mouse Chr 7, rat Chr 1 and cattle
Chr 15) and DGAT2L7/MOGAT3 (on human Chr 7, mouse
Chr 5, rat Chr 12; for details, see Table 1).

The evolutionarily conserved tandem arrangement of a
MOGAT and a DGAT gene might facilitate a concerted regulation
of transcription (e.g. by a common upstream regulatory
region). This would make sense because MOGAT provides the
substrate for DGAT2. The three DGAT2 lineage members on
the X chromosome likely resulted from local duplication
events.

Information regarding the actual physiological role of
DGAT2 and the closely related MOGAT genes in vivo is only
just becoming available. High mRNA expression in liver tissue
indicates that DGAT2 could be especially important for the
assembly of fatty acids synthesized de novo from excess carbohydrates
into VLDL lipoproteins. High expression of DGAT2
in adipose tissue also suggests a significant function in triglyceride
storage (Cases et al., 2001). Recent experiments in mice
revealed that DGAT2 mRNA expression is stimulated by insulin
(Meegalla et al., 2002). MOGAT1, MOGAT2 and MOGAT3
are characterized by distinct tissue-specific expression
patterns and substrate specificity of the encoded enzyme.
MOGAT1 mRNA expression was found in adipose tissue, stomach
and kidney from mouse, but not in the small intestine (Yen et
al., 2002). MOGAT2 and MOGAT3 are both highly expressed
in the intestine and could play an essential role in intestinal
dietary fat absorption by re-synthesizing triglycerides from fatty
acids and monoacylglycerol. MOGAT2, in parallel to
DGAT2, is also highly expressed in the liver in humans (Yen
and Farese, 2003). Recent in vitro experiments have shown
that MOGAT3 also has a significant DGAT activity (Cheng et
al., 2003). Expression of MOGAT3 was reported to be restricted
to the gastrointestinal tract with the highest level found
in the ileum (Cheng et al., 2003).

All members of the DGAT2/MOGAT gene family are high-priority
candidate genes for quantitative traits related to dietary
fat uptake and triglyceride synthesis and storage in farm
animals. Moreover, this gene family may also play a key role in
polygenic human diseases such as obesity and type 2 diabetes.
DGAT2/MOGAT enzymes in enterocytes might prove to be
feasible pharmaceutical targets: blocking them could inhibit
intestinal fat absorption and thereby help treat obesity in
humans (Yen and Farese, 2003).
Abstract. Myostatin (GDF8) acts as a negative regulator of
muscle growth. Mutations in the gene are responsible for the
double muscling phenotype in several European cattle breeds.
Here we describe the sequence of the 5' upstream region of the
myostatin gene. Sequence analysis was carried out on three
animals from each of nine European cattle breeds to search
for polymorphisms. A T/A polymorphism at -371 and a G/C
polymorphism at -805 (relative to ATG) were found. PCR-RFLP
was used to further screen 353 animals of the nine breeds
studied and to assess the frequencies of the SNPs. The promoter
region of the gene contains several binding sites for transcription
factors that are also found in other myogenic genes. These
sites may play an important role in regulating expression of the
protein and, consequently, muscular development.

Copyright©2003 S. Karger AG, Basel 


Growth differentiation factor 8 (GDF8) or myostatin is a
member of the transforming growth factor β (TGF-β) superfamily,
which includes proteins that mediate key events in cell
growth and development through signal transduction. In the
absence of myostatin, the skeletal musculature of mice is two
to three times greater in mass than that of wild-type mice
(McPherron et al., 1997). The enlarged musculature of the
myostatin-deficient mice results from effects on both early
(hyperplasia) and late (hypertrophy) myogenic processes. Recent
investigations demonstrated that myostatin acts as a negative
regulator of myogenesis (Lee et al., 2001) and inhibits myoblast
proliferation during the cell cycle and myogenic differentiation
(Thomas et al., 2000; Taylor et al., 2000; Rios et al., 2002).

Several cattle breeds are characterized by a double muscling
phenotype (an increase in the number of normal-sized muscle
fibers that results in enlarged muscles with deep creases) caused
by mutations at the GDF8 locus. A large number of alleles of
the gene have been identified in cattle, most of which are silent
or neutral in their resultant effect (Grobet et al., 1997; Kambadur
et al., 1997; Karim et al., 2001). Although the double muscling
phenotype is genetically heterogeneous, the extreme phenotypes
are mostly related to stop codons in the third exon of
the GDF8 gene (Grobet et al., 1998; Cappuccio et al., 1998).
However, in some breeds double muscling is not associated
with any disruptive mutation in the gene. In such cases,
studying the upstream sequence of the gene to detect additional
mutations could help to identify probable causes of double
muscling features besides those related to coding regions. The
promoter region of GDF8 has been sequenced in human (Ferrel
et al., 1999; Ma et al., 2001), pig (Stratil and Kopecny, 2000)
and cattle (Spiller et al., 2002). In this study we sequenced the
promoter region of the gene in nine cattle breeds and report two
point mutations, both transversions: T→A at -371 and G→C at
-805.


Materials and methods 

Pedigree materials, phenotyping and genotyping 
A total of 353 individuals were sampled from nine European cattle
breeds: Marchigiana (114), Chianina (40), Romagnola (76), Piedmontese
(41), Holstein Friesian (19), Italian Red Pied (11), Brown Swiss (13), Belgian
Blue (29), Limousine (10). For the Marchigiana, Chianina and Romagnola
breeds, the muscling scores available from ANABIC (National Association of
Beef Breeders) permitted us to check associations between possible variants
in the promoter and in the coding region (Yang et al., 1997). For the
other breeds, however, the phenotype was missing because the biological
material was collected from the semen market (CIZ, Italy). Mutations
causative of double muscling were genotyped by different methods according
to their features. Simple PCR was used in Belgian Blue to detect the 11-bp
deletion (nt821(del11); Grobet et al., 1997). PCR-RFLP was used in
Marchigiana to detect the G/T transversion (E291X; Marchitelli et al., 2003).
Sequence analysis was used in Piedmontese and in Limousine to detect the
G/A transition (C313Y; Kambadur et al., 1997) and the C/A transversion
(F94L; Grobet et al., 1998), respectively. Chianina and Romagnola breeds
were also typed for the specific Marchigiana mutation because of known past
genetic exchanges with the latter. The Holstein Friesian and Brown Swiss
animals, with a conventional non-double muscled phenotype, were analysed
as controls.

Fig. 1. Sequencing strategy for the GDF8 5' upstream region.

DNA extraction and PCR amplification 

Genomic DNA was isolated either from fresh whole blood collected in
lithium-heparin anticoagulant tubes using the Wizard Genomic DNA
Purification Kit (Promega Corporation) or from commercial semen doses
using a modified cetyltrimethylammonium bromide (CTAB) method (Yang et
al., 1997). Using available sequence information from the promoter regions
of GDF8 in human (EMBL accession number AC073120), pig (EMBL accession
numbers AJ133580, AF093798) and cattle (EMBL accession number AF348479),
PCR primers 5'-CCCTACAGAGGCCACTTCAA-3' and
5'-CTCGCTGTTCTCATTCAGATC-3' were designed to span a region upstream
of the first exon. Inverse PCR amplification (Pang et al., 1997) was
performed in 50-µl reactions containing 100 ng of genomic DNA, 10 mM
Tris-HCl (pH 9.0), 50 mM KCl, 1.5 mM MgCl2, 100 µM each of the four
dNTPs, 40 pmol of each primer and 2 U of Taq DNA Polymerase (Amersham
Pharmacia Biotech Inc, Piscataway, NJ, USA) under the following conditions:
one cycle at 94 °C for 2 min; 34 three-step cycles of 94 °C for 1 min,
56 °C for 1 min and 72 °C for 1 min 30 s; and a final extension for 10 min
at 72 °C. The 1,380-bp PCR product was purified for sequencing with a GFX
PCR Gel Band Purification Kit (Amersham).

Two further primer pairs were designed to produce fragments for PCR-RFLP
analysis. The primers 5'-CTGAGGGAAAAGCATATCAAC-3' and
5'-CCAGCAACAATCAGCATAAATAG-3' were used to amplify a 561-bp
fragment, while 5'-GCTCCCAGACCTTACCCCAAATC-3' and
5'-GTTGATATGCTTTTCCCTCAG-3' were used to amplify a 730-bp fragment. PCRs
were carried out using the procedure described above. The two PCR products
were digested with DraI and SpeI, respectively.
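
As a plausibility check on the PCR-RFLP primer sequences quoted above, the following minimal Python sketch computes reverse complements. It is not part of the original methods. The primer strings are transcribed from the text with spacing removed, and the names `reverse_complement`, `PRIMERS_561`, `PRIMERS_730` and `shares_primer_site` are hypothetical. Under these assumptions, the first primer of the 561-bp pair is the reverse complement of the second primer of the 730-bp pair, which suggests that the two amplicons share this 21-nt site.

```python
# Sketch only: primer strings are transcribed from the Methods text.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))

# Primer pairs used for the DraI (561-bp) and SpeI (730-bp) PCR-RFLP tests.
PRIMERS_561 = ("CTGAGGGAAAAGCATATCAAC", "CCAGCAACAATCAGCATAAATAG")
PRIMERS_730 = ("GCTCCCAGACCTTACCCCAAATC", "GTTGATATGCTTTTCCCTCAG")

def shares_primer_site(primer_a: str, primer_b: str) -> bool:
    """True if the two primers anneal to opposite strands of the same site."""
    return reverse_complement(primer_a) == primer_b

if __name__ == "__main__":
    print(shares_primer_site(PRIMERS_561[0], PRIMERS_730[1]))
```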

Sequencing 

Sequencing of the PCR fragment (Fig. 1) was performed by the Sanger
dideoxy chain termination method using the fmol DNA Cycle Sequencing
System (Promega Corporation, Madison, WI, USA) with a series of nested
primers. Sequencing reactions were analysed by electrophoresis on 6%
denaturing polyacrylamide gels using a LI-COR Gene ReadIR 4200 automated
sequencer. The sequences were analysed with the Eseq software (Li-Cor Inc.,
Lincoln, USA), and alignments were performed with the BIOEDIT software (Tom
Hall, Department of Microbiology, NCSU, USA). Sequence homology
comparisons were carried out using BLAST (Altschul et al., 1997).

The 5' sequence was examined for consensus regulatory elements using
the TRANSFAC database (Heinemeyer et al., 1999) and the SIGNAL SCAN
program (Prestridge, 1999, 2000) from BIMAS (Bioinformatics
and Molecular Analysis Section). For the prediction of the TSS
(transcriptional start site), the CPROMOTER software (Zhang, 1998) was used.

PCR-RFLP analysis 

DraI digestion was carried out in a total volume of 18 µl of reaction mixture
containing 400 ng of amplicon, 1× buffer (1 mM Tris-HCl [pH 7.5], 1 mM
MgCl2, 0.01 mg/ml BSA) and 2.5 U DraI (MBI Fermentas) at 37 °C for 1 h. SpeI
digestion was carried out in a total volume of 18 µl of reaction mixture
containing 400 ng of amplicon, 1× buffer (6 mM Tris-HCl [pH 7.5], 5 mM MgCl2,
50 mM NaCl, 1 mM DTT) and 2.5 U SpeI (Promega) at 37 °C for 1 h. The
products were analyzed on ethidium bromide-stained 2.5% agarose gels and allele
frequencies were estimated. Significant departure from Hardy-Weinberg
equilibrium was tested by a χ² test.
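
The allele-frequency estimate and the Hardy-Weinberg χ² test mentioned above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function names are hypothetical, and the example counts (83 T/T, 27 T/A, 4 A/A) are the Marchigiana DraI genotypes reported in Table 2. The statistic is compared with 3.841, the 5% critical value of the χ² distribution with one degree of freedom.

```python
# Sketch only: allele frequencies by gene counting and a one-degree-of-freedom
# Hardy-Weinberg chi-square for a biallelic locus.

def allele_frequencies(n_hom1: int, n_het: int, n_hom2: int) -> tuple[float, float]:
    """Return (p, q): frequencies of allele 1 and allele 2."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)
    return p, 1.0 - p

def hardy_weinberg_chi2(n_hom1: int, n_het: int, n_hom2: int) -> float:
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    n = n_hom1 + n_het + n_hom2
    p, q = allele_frequencies(n_hom1, n_het, n_hom2)
    observed = (n_hom1, n_het, n_hom2)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic locus).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

CHI2_CRITICAL_1DF_5PCT = 3.841

if __name__ == "__main__":
    p, q = allele_frequencies(83, 27, 4)  # Marchigiana, DraI (Table 2)
    chi2 = hardy_weinberg_chi2(83, 27, 4)
    print(f"T = {p:.3f}, A = {q:.3f}, chi2 = {chi2:.3f}")
    print("departure from HWE" if chi2 > CHI2_CRITICAL_1DF_5PCT else "consistent with HWE")
```

With small expected counts in the rare homozygote class the χ² approximation is only rough, so an exact test may be preferable for the smaller breed samples.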

Statistical analysis 

Data from the breeds with a muscularity index (Marchigiana, Chianina and
Romagnola, 230 subjects) were analysed by ANOVA (SAS software GLM
procedure) using the following model: y = µ + br + ex + pr + ex*pr + e, to
estimate the effect of coding and promoter region haplotype on the index.
The exon*promoter interaction was computed only for the Marchigiana
breed (114 subjects), which harbours the E291X mutation in the third exon.
Symbols are: y = muscularity index, µ = overall mean, br = breed, ex = exon
III genotype, pr = promoter genotype, ex*pr = interaction between promoter
and exon III SNPs, e = random error. Only the DraI polymorphism had
sufficient variability to allow an assessment of the association between
haplotype and phenotype.
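
For readers who prefer indexed notation, the model above can be written as follows. The subscripts are an assumption added here for illustration (i = breed, j = exon III genotype, k = promoter genotype, l = individual); the original text gives the model without indices.

```latex
y_{ijkl} = \mu + br_i + ex_j + pr_k + (ex \times pr)_{jk} + e_{ijkl}
```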


Results 

Genotype at the coding regions 

All the individuals of Belgian Blue, Piedmontese and Limousine
were homozygous for the mutations reported in the
literature (Grobet et al., 1998). In the Marchigiana breed 75
individuals were wild type (+/+), 29 were heterozygous (mh/+)
and 10 were mutated homozygous (mh/mh) (Marchitelli et al.,
2003). All the individuals of the other breeds showed the wild
type genotype.

Sequence data analysis 

A 1,270-bp PCR fragment of the myostatin promoter was
sequenced in three animals from each of the nine cattle breeds. In
Marchigiana we sequenced three individuals for each of the
possible genotypes. In this breed the region has an A+T content
of 68.1% and shows several binding sites for transcription
factors, as listed in Table 1. The ATG translation start
codon is located at position 1215 of the sequenced region.
Analysis of the promoter sequence shows a putative TSS at position
-173 (relative to ATG). BLAST analysis of the myostatin
promoter sequence showed alignment with ten Homo sapiens
cDNA clones (BQ432391, BQ710180, BQ678660, BQ432377,
BQ228155, BQ189605, BQ186457, BQ182268, BQ026840,
BM988934). In one of these ESTs, a putative TSS was located
at position -140 relative to ATG. Further, in our sequences we
detected an A base at positions -360 and -971 (relative to the ATG
translation start codon), compared with a C base and a G base
detected in the other published bovine sequence (accession number
AF348479). PCR-RFLP tests with TaqI (T/CGA) and
AluI (AG/CT) on 70 animals from all nine breeds included in
this study confirmed the presence of nucleotide A (data not
shown).

Fig. 2. Agarose gel electrophoresis (2.5%) of DraI digestion of the PCR
fragment. Homozygous T/T animals are in lanes 2-3, homozygous A/A animals
are in lanes 4, 5, 6 and 8, and heterozygous T/A animals are in lanes 7 and 9.
The 561-bp amplicon is in lane 1. M is the 100-1,000-bp PCR low ladder (Sigma).

Fig. 3. Agarose gel electrophoresis (2.5%) of
SpeI digestion. Lane 2 shows the G/G genotype,
lanes 3-4 the G/C genotype. Lane 1 contains
the 730-bp PCR product.


Table 1. Binding sites of transcription factors in the promoter region of
the myostatin locus (location is relative to the ATG translation start codon)

Factor                  Location                                              Signal sequence
AP-1                    -304                                                  TGAATCA
TFIID                   -524; -421; -163; -139                                TATAAAA
CBP, CP1, NF-Y, CBF     -206                                                  CCAAT
GR                      -1011                                                 CAGAG
Sp1                     -99                                                   GGGCAG
GATA-1                  -1005                                                 aGATAActaca
Enhancer (site)         -977                                                  TTTCCA
MyoD                    -1185                                                 CAGGTG
WAP-US6                 -435; -756; -374                                      TTTAAA
LBP-1                   -806                                                  WCTRG
NF-1                    -744                                                  TCCA
Pit1                    -372                                                  TAAAT
NF-1/L                  -120                                                  TGGCA
CACCC-binding factor    -294; -1072                                           CACCC
CAP-site                -516; -488; -350; -299                                CANYYY
MEF-2                   -584                                                  YTAWAAATAR
E-boxes                 -153; -186; -308; -543; -776; -800; -1167; -1186      CANNTG

Polymorphism and allele frequencies

Sequence data analysis revealed a single nucleotide polymorphism
at position -371 (relative to the ATG start codon) in one
animal of the Belgian Blue breed. This mutation is a T/A
transversion that introduces a site for the DraI or
AhaIII restriction enzyme (TTTTAA → TTTAAA), allowing
the design of a simple PCR-RFLP test to confirm the sequence
data. Digestion of the 561-bp PCR fragment with DraI
resulted in fragments of 426, 73 and 62 bp for genotype T/T;
365, 73, 62 and 61 bp for genotype A/A; and 426, 365, 73, 62
and 61 bp for genotype T/A (see Fig. 2). Analysis of an additional
352 animals from the nine breeds showed 248 T/T, 22 A/A and
82 T/A individuals. The frequencies of the two alleles in the
studied breeds are given in Table 2.

The frequency of the A allele was much higher in the Belgian
Blue (0.793) than in the other breeds. In Italian Red
Pied, Limousine, Piedmontese and Holstein Friesian the frequency
was particularly low (range 0.045 to 0.079). Homozygous
A/A individuals were present only in the Belgian Blue (18),
Marchigiana (4) and Brown Swiss (1).

Comparison of our consensus sequence with the bovine
sequence from the database (accession no. AF348479)
showed a G/C point mutation at position -805. This mutation
introduces a SpeI restriction site (AGTAGT → ACTAGT).
PCR-RFLP testing by digestion of the 730-bp amplicon with SpeI
produced fragments of 567 and 163 bp in homozygous (G/G) animals
and 567, 409, 163 and 158 bp in heterozygous (G/C) animals (see
Fig. 3). Analysis of the 353 animals from the nine breeds
showed 320 G/G homozygotes and 33 G/C heterozygotes but no
C/C homozygotes. The frequencies of the two alleles in the
studied breeds are given in Table 3. The frequency of the C
allele is low (range 0-0.091) in most of the cattle breeds analysed;
only in the Chianina breed does it reach 0.175. The
Brown Swiss and Limousine breeds are monomorphic for the
G allele. Genotype frequencies are consistent with Hardy-Weinberg
equilibrium. The frequencies of the two polymorphisms considered
together (Table 4) range from 0.025 for the rarest genotype (T/A_G/C)
to 0.630 for the most frequent genotype (T/T_G/G). However,
some breeds have a very low frequency of the latter (e.g. Belgian
Blue, 0.034), while the former never exceeds 0.125 (Chianina).
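
The banding patterns described above can be summarised in a short lookup sketch, which may help when scoring gels. The DraI fragment sizes for both alleles and the SpeI sizes for the G allele are taken from the text. The fragments for the SpeI C allele (409, 163, 158 bp) are an inference from the reported heterozygote pattern, since no C/C homozygote was observed. The names `ALLELE_FRAGMENTS`, `AMPLICON_LENGTH` and `expected_bands` are hypothetical.

```python
# Sketch only: expected PCR-RFLP band sizes (bp) per genotype.
# A heterozygote shows the union of the fragments of both alleles.

AMPLICON_LENGTH = {"DraI": 561, "SpeI": 730}

ALLELE_FRAGMENTS = {
    "DraI": {"T": (426, 73, 62), "A": (365, 73, 62, 61)},
    # C-allele fragments are inferred: the new SpeI site splits 567 into 409 + 158.
    "SpeI": {"G": (567, 163), "C": (409, 163, 158)},
}

def expected_bands(enzyme: str, genotype: str) -> list[int]:
    """Return distinct band sizes, largest first, for a genotype such as 'T/A'."""
    fragments = ALLELE_FRAGMENTS[enzyme]
    bands: set[int] = set()
    for allele in genotype.split("/"):
        bands.update(fragments[allele])
    return sorted(bands, reverse=True)

if __name__ == "__main__":
    for enzyme, genotype in [("DraI", "T/T"), ("DraI", "T/A"), ("SpeI", "G/C")]:
        print(enzyme, genotype, expected_bands(enzyme, genotype))
```

Note that the 62- and 61-bp DraI fragments would be hard to resolve on a 2.5% agarose gel, so in practice the 426- and 365-bp bands carry the genotype information.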
Table 2. Genotypes and frequencies of DraI alleles

                                   Genotype             Allelic frequencies
Breed              Individuals   T/T   T/A   A/A      Allele T   Allele A
Belgian Blue            29         1    10    18        0.207      0.793
Brown Swiss             13        10     2     1        0.846      0.154
Chianina                40        29    11     0        0.863      0.138
Holstein Friesian       19        16     3     0        0.921      0.079
Limousine               10         9     1     0        0.950      0.050
Marchigiana            114        83    27     4        0.846      0.154
Italian Red Pied        11        10     1     0        0.955      0.045
Piedmontese             41        36     5     0        0.939      0.061
Romagnola               76        54    22     0        0.855      0.145

Table 3. Genotypes and frequencies of SpeI alleles

                                  Genotype       Allelic frequencies
Breed              Individuals   G/G   G/C      Allele G   Allele C
Belgian Blue            29        28     1        0.983      0.017
Brown Swiss             13        13     0        1          0
Chianina                40        26    14        0.825      0.175
Holstein Friesian       19        18     1        0.974      0.026
Limousine               10        10     0        1          0
Marchigiana            114       112     2        0.991      0.009
Italian Red Pied        11         9     2        0.909      0.091
Piedmontese             41        39     2        0.976      0.024
Romagnola               76        65    11        0.928      0.072


Table 4. Genotype frequencies of both SNPs in the analysed breeds

Breed              Individuals   A/A_G/G   T/A_G/C   T/A_G/G   T/T_G/C   T/T_G/G
Belgian Blue            29        0.621     0.034     0.310     0         0.034
Brown Swiss             13        0.077     0         0.154     0         0.769
Chianina                40        0         0.125     0.150     0.225     0.500
Holstein Friesian       19        0         0         0.158     0.053     0.789
Italian Red Pied        11        0         0         0.091     0.182     0.727
Limousine               10        0         0         0.100     0         0.900
Marchigiana            114        0.035     0.009     0.228     0.009     0.719
Piedmontese             41        0         0         0.122     0.049     0.829
Romagnola               76        0         0.026     0.263     0.118     0.592
All samples            353        0.064     0.025     0.210     0.070     0.630


Table 5. Least squares means from ANOVA
analysis of muscularity index by SNPs at the promoter
and at exon III in the Marchigiana breed
(114 subjects). Different letters mean P < 0.05.

Exon III     DraI    Muscularity value LSMEAN    Standard error
wild (a)     TA      109.651671                   5.51
wild (a)     TT      106.216529                   3.97
mh/+ (a)     TA      105.947043                  11.41
mh/+ (b)     TT      130.558679                   5.90


Association of haplotype with muscularity index

No statistically significant differences at the promoter region
were observed among genotypes of all breeds with phenotypic
scoring (data not shown). The Marchigiana breed allowed
for a combined two-factor analysis of the promoter and the
third exon. A significant difference was observed between
individuals bearing the combination mh/+ at the third exon and
T/T at the DraI site and those bearing +/+ at exon III and T/A or
A/A at DraI (Table 5), with an index about 30% higher in the
former group. The alternative promoter polymorphism could not
be evaluated in mh/mh individuals, because they were invariant at the
promoter (T/T_G/G) due to the perfect linkage disequilibrium
present around the GDF8 locus (Marchitelli et al., 2002).

Discussion 

Compared with conventional cattle, the double-muscling phenotype
is characterized by a lower proportion of bone, increased
muscling, reduced fat content, reduced marbling, increased meat
tenderness, increased birth weight and increased dystocia (Arnold
et al., 2001). Myostatin is involved in double muscling because it
functions as a negative regulator of muscle cell growth,
inhibiting myoblast proliferation and differentiation
(Thomas et al., 2000; Langley et al., 2002). Some mutations in
the gene cause malfunctioning, leading to the "double-muscled"
phenotype in cattle (Kambadur et al., 1997; McPherron
et al., 1997; Grobet et al., 1998; Karim et al., 2001). In some
cattle breeds double muscling is not associated with a functional
mutation (Grobet et al., 1998). For example, in the South
Devon breed, animals homozygous for the nt821(del11) mutation
do not present a double-muscled phenotype (Smith et al.,
2000). This could be attributed to additional mutations, either
outside the coding region (intron or promoter) or at other
chromosomal regions segregating with myostatin alleles, that
influence double muscling.

In this work we isolated, sequenced and partially characterized
the promoter region of the myostatin gene GDF8. Two A
bases were detected at nucleotide positions -360 and -971 and
were confirmed by two PCR-RFLP analyses. We therefore suspect
sequencing errors in the published
sequence (accession no. AF348479).

The two polymorphisms (a T/A transversion at -371 and a
G/C transversion at -805, relative to the ATG translation start
codon) were used for a PCR-RFLP test to screen 353 animals
from nine breeds.

A T→A transversion at the same position has previously been
identified in the promoter region of porcine GDF8
(Stratil and Kopecny, 1999). A recent study in pig analysed the
relationship between the mutation and growth traits, showing
that individuals with the TA genotype had a higher average daily
gain than those with the TT genotype (Jiang et al., 2002). We found
an effect of this polymorphism on muscularity only in the Marchigiana
breed and only if individuals also carried a mutation
in the coding region. However, the phenotypic traits examined
in the two studies are quite different, though both are related to meat
production. Considering the two polymorphisms together, the
most common haplotype is T/T_G/G. We hypothesize that this
is the ancestral haplotype because it is also present in the
published human sequence and is widespread in a variety
of cattle breeds. The almost unique presence of the
A/A_G/G haplotype in Belgian Blue animals may be due to a
strong founder effect.

A direct association between the promoter polymorphisms
and the phenotype was not found. It must be noted that the
muscularity index we used is computed on the overall muscularity
of the animals. It is well known (Arthur, 1995), though,
that disruptive myostatin mutations mostly affect the hindquarters
of the animals, and this trait only partially contributes
to the muscularity index. Moreover, it has been demonstrated
in other genes, for example the β2-adrenergic receptor gene, that
the unique interactions of multiple SNPs within a haplotype
are associated with a particular biological response, while the
single polymorphisms, considered separately, may not show
any association (Drysdale et al., 2000). In our dataset the
combination of mh/+ at the third exon and T/T at the DraI site
showed a significant phenotypic difference versus the other
genotypes, suggesting a regulatory role for that promoter
region.

The analysis of the myostatin promoter for regulatory elements
shows putative transcription factor binding motifs that
are common with other muscular differentiation and development
genes expressed in muscle cells, suggesting a common
mechanism of transcriptional regulation (Spiller et al., 2002).
The polymorphisms found by us are located within three
transcription factor-binding motifs and could affect the binding
of these factors, controlling myostatin expression. The C
allele at the SpeI site creates an LBP-1 binding motif at -806 (signal
sequence WCTRG). The A allele at the DraI site generates a
Pit1 (signal sequence TAAAT) and a WAP-US6 (signal sequence
TTTAAA) binding motif at -372 and -374, respectively.

Knowledge of the possible variants in the promoter
region of the myostatin gene shown in this study helps elucidate
the regulatory mechanisms of the gene in mammalian species.
The observations have economic implications because the
findings could lead to the development of selection plans that
are useful both when the double muscling trait is considered
advantageous (breeding in controlled conditions) and when
the trait should be avoided (range breeding). Since myostatin
also plays a role in breeds not carrying double muscling, the
variants occurring in the promoter region can also be useful to
fine-tune selection schemes aimed at obtaining a desired grade of
muscularity.
Abstract. Thirty-eight bovine Y chromosome (BTAY)
microsatellites (MS) were assessed for polymorphisms in DNA
samples obtained from 17 unrelated bulls. Thirty-three of these
microsatellites are new and were used for the construction of a
first generation radiation hybrid map for BTAY (Liu et al.,
2002). Five MS had been previously reported and were used as
positive controls. Fourteen out of 38 MS were found to be
polymorphic; the remaining 24 were uninformative among the
animals tested. The number of hemizygous loci per MS within
an individual ranged from two to over 20. Seven MS presented
smear- or ladder-like bands, a unique feature for Y chromosome
multi-copy hemizygous MS loci. The locus length variance
within an individual ranged from 2 to 42 bp, corresponding
to the MS with the minimum and maximum number of loci
observed, respectively. Within the 14 polymorphic MS, the five
pseudoautosomal MS were, on average, more polymorphic
(35.3%) than the nine Y-specific MS (19.6%). Haplotypes
resulting from combinations of these polymorphic loci will
provide a powerful tool for future studies on the origin of domestic
cattle and the evolution of bovid species.

Copyright©2003 S. Karger AG, Basel 


Y chromosome-specific microsatellites (MS) are of special
interest because they are haploid and paternally inherited. The
lack of recombination in the Y-specific region, which constitutes
about 95% of the Y chromosome, implies that the Y chromosome
is inherited "en bloc" as a haplotype. The value of
marker polymorphisms in the Y chromosome has been widely
recognized, and they have been used not only in
studies of human evolution (Dorit et al., 1995; Hammer 1995;
Hammer et al., 2001; Ruiz et al., 1996; Kayser et al., 2001) but
also in studies of forensic genetics (Gill et al., 2001). However,
similar investigations have not been carried out efficiently in
farm animals because of their less developed Y chromosome
maps and the lack of Y-specific polymorphic markers. Among
all markers (~40) reported so far on the bovine Y chromosome
(BTAY), only four have been found to be polymorphic in cattle
and related bovid species (Hanotte et al., 1997; Edwards et al.,
2000). Using chromosomal microdissection and microcloning
techniques, Ponce de Leon (1996) constructed a BTAY-specific
DNA library. By screening the BTAY library, 33 new MS have
recently been developed and mapped to BTAY (Liu et al.,
2002). In the present study, the newly developed MS have been
assessed for polymorphisms in cattle.

Materials and methods 

Microsatellites and DNA samples 

The BTAY library screening and MS development are detailed in Liu et 
al. (2002). A total of 38 MS (Table 1), including 33 MS that were identified 
from our BTAY-chromosome library, and five BTAY MS previously 
reported, shown to be polymorphic in a range of bovid species (Edwards et 
al., 2000), were assessed for polymorphisms in 17 unrelated males and one 
female from six breeds and crossbreds of domestic cattle (Table 2). 

Genotyping 

PCR was performed in a 96-well Thermo HYBAID MBS thermocycler.
Prior to genotyping, PCR conditions for each pair of MS primers were
optimized by testing over a range of annealing temperatures (50-64 °C) using a
gradient program and one male and one female bovine DNA sample. Each
PCR reaction contained 20 ng of DNA, 1× buffer (TaKaRa, Biomedicals),
2 mM MgCl2, 3 pmol of each primer and 0.15 U TaKaRa Ex Taq DNA
polymerase in a total volume of 12 µl. The concentration of dNTPs was
200 µM, with the exception of dATP, which was reduced to 20 µM. Instead,
0.3 pmol of [α-33P]-dATP was added to each reaction. The amplification
program consisted of an initial denaturation at 95 °C for 5 min, followed by
36 cycles of 94 °C for 30 s, 52-64 °C for 30 s and 72 °C for 30-45 s, and a
final extension step at 72 °C for 5 min. The radio-labeled PCR products were
denatured at 95 °C and electrophoresed through 7% polyacrylamide sequencing
gels. After autoradiography, allele sizes were determined by comparing
amplified fragments to internal size markers from a sequencing reaction of
the vector M13mp18 (sequence V2.0 DNA sequencing kit, USB Corporation).


Results and discussion 

Locus numbers of Y chromosome microsatellites
By screening the BTAY-specific library with a (CA)12 oligo
probe, 33 new MS were developed, six of which mapped to the
PAR, and 27 to the Y-specific region by RH mapping (Liu et
al., 2002). All new MS, together with five previously reported
Y-specific MS that were used as positive controls in the present
study, were genotyped in a group of 17 unrelated bulls and one
female.

In describing our results it is important to point out that for
the PAR, where genetic recombination between BTAX and
BTAY occurs, a gene locus or gene loci are interpreted to be the
same as for somatic chromosomes. However, when referring to
the BTAY-specific region, each locus is hemizygous and therefore
alleles of a hemizygous locus can only be assessed when
compared across individuals (males) in the population. Likewise,
in the BTAY-specific region a gene locus can have multiple
copies distributed along the chromosome, each copy being a
hemizygous locus and therefore constituting a multicopy hemizygous
gene. The latter, when analyzed by PCR, could yield a
single band, indicating lack of sequence variation and making it
difficult to conclude if the hemizygous gene is multicopy, or


Table 1. Details of the bovine Y chromosome-specific microsatellite loci

Marker          Forward primer          Reverse primer          Type of                             Number of          Size range        Anneal   GenBank
                (5'-3')                 (5'-3')                 repeat a                            loci/individual b  of loci (bp)      Tm (°C)  accession no.

UMN0803         GATCCACATCCCCCTCAC      CTGCTTCTCTTGTCCGCTAA    (CA) 4 . 6 ccctc-ACACAA) 6          4                  269-308           60       AF483745
UMN2908         GGACTGAAGCGAGTTAGCAC    CACATCCCTGCTCACACACG    (TG)4N43(TG)7                       7                  175-185           58       AF483747
UMN0905         ATCAACCGTGGTAGCTCTAA    CTAGAATGTAAACCAGCTGC    (CA)16                              12                 160-174           61       AF483748
UMN0929         ACCAGCTGATACACAAGTGC    GGTCAGAGAATGAAACAGAG    (CA)19                              10                 176-197           61       AF483749
UMN0108         GATCCATCCACATTGCTCCA    CCAAGCGTCCATCAATTTAC    (TG)18                              7                  57-64, 189-250    60       AF483744
UMN2008         CAAGCATATCAGTGGCCTGG    GCTGCAAGGAAACTATTTCA    (CA) 2 GA(CA)„G-(CA) 3              6                  133-141           61       AF483746
UMN3008         TTGTGGAGGACTATTCATGG    TCTGGACTCGACAGGACACC    (TG)17                              >16                172-214           56       AF483755
UMN0307         GATACAGCTGAGTGACTAAC    GTGCAGACATCTGAGCTGTG    (CA)18                              12                 101-162           58       AF483750
UMN1203         AACCAGTTGCGCACTCACCA    AGGCGACTTGTTCACAAGGT    (CA)13                              >4                 102-108, 226-130  58       AF483751
UMN0907         CTGTTGATACTTTCTTCCTG    CTGATGGACATCTGATATTC    (TG)21(TTTA)3                       2                  140, 145          55       AF483775
UMN1514         CTTCCTGAGAGTGTTCCAGT    TATTCACAAGGCCTCTGGAC    (TG) 15 . 21 (TTTA) 3               6                  112-117           54       AF483752
UMN2303         TACTTGCTTGAGACTTACTG    TGTGAACACATCTGATTCTG    (TG)17                              >20                98-132            56       AF483753
UMN2001         TCAGGCAAGACTACTGGAGC    TACCCTGGCGATTCTGCAA     (CA)18                              2                  131, 134          54       AF483754
UMN0103         ACACAGAGTATTCACCTGAG    ATTTACCTGGGTCAAAGCAC    (CA)22                              7                  124-136           60       AF483757
UMN0301         GCCTTGGCTAGTGCGCAACC    CAAAACTGTTGCACTGTTTC    (TG)4TA(TG)6AA-(TC)2(TG)4           7                  100-106           60       AF483756
UMN0304         TGATATTCACAAGGCCGCTG    GGCTGTGGTATACTATGGAG    (TAAA)3TA(CA)16                     >10                210-232           55       AF483758
UMN0311         GTTGAGGTCTCTTGCATCTG    CCAACAATGCTTCATCCTTC    (TG)17                              2                  152, 156          56       AF483759
UMN0406         GTTGAGGACTCTTGCATCTG    TGCTTCATCCTTCATTCCAC    (TG)18                              14                 140-172           56       AF483760
UMN0504         AGGCCATCTGCATAGTGAAG    TGCTGGACTGCTCATCTCTG    (CT)2GT(CT)3GT2                     4                  106-144           56       AF483761
UMN0705 (TSPY)  TATAGCTGAGACACGTGAGT    CTCCTCCAGACCTGTGTATG    (CA)n(TACA) 2 C-ACGTA(CA)i 3 . 19   Smear              180-210           64       AF483762
UMN0920         GTTGAGGACTCTTGCATCTG    CACAGGCCTAGAAGATTGAG    (TG)„. 22                           >16                254-290           61       AF483763
UMN1113         ACAGCACTTCTTAACAAAGC    TAGCCACACATCATGTTC      (CA)10TA(CA)4                       7                  124-140           58       AF483764
UMN1201         TGCTTCATCCTTCATTCCAC    TTGTTGAGGACTCTTGCATC    (CA)17                              4                  48-49, 390-398    58       AF483765


54 


Cytogenet Genome Res 102:53-58 (2003) 



Table 1 (continued)

Marker          Forward primer          Reverse primer          Type of                             Number of          Size range        Anneal   GenBank
                (5'-3')                 (5'-3')                 repeat a                            loci/individual b  of loci (bp)      Tm (°C)  accession no.

UMN1307         GAGTTGATATCTTGTGGCAG    ACATGTCAAGGTCACACACC    (TG)nA(TTTA)2                       4                  185-190           60       AF483766
UMN1605         CCACACTGAACTGCCTAACT    AACTATCTGCTAAGGATTGG    (CA)5N5(CA)2(TA)2N4(CA)6N3(CA)8     2                  173, 177          54       AF483767
UMN2102         TCACCTTCCTTGTGTCATCTG   GAACAAGACCGCATTGCCA     (TG)„. 22                           3                  158-194           56       AF483768
UMN2404         GGTACAATTGAAAATATG      TGTACCTACACTGATATGTT    (CA)18N10(TA)8                      >15                85-112            54       AF483769
UMN2405         CCTGCCATCCATTGTGAAGA    CTGCTTACCTGGTCAGGATT    (CA)16                              12                 140-176           55       AF483770
UMN2611         CCAGACTTGTGTATGCTGAG    CTAAAAATAACTGACATGGG    (CA)nTAAA-(CATA)2(CA)5              3                  155, 157, 160     58       AF483771
UMN2706         TTGTTGAGGACTCTTGCATC    CCACATATCAGGCAAAGTCAT   (TG)18                              >20                109-150           54       AF483772
UMN2713         GTACCTACACTAATATGTTCA   CCAAAGAAAGTTCAGGTACA    (TG)25                              >20                94-124            54       AF483773
UMN2905         CAGATAAGAATGCGAAAGTC    GCAAAGTGTCAGCTTAAACC    (CA)18                              5                  107-119           56       AF483774
UMN0910         AACCAGATTCTTTGTAGCCA    CCTGTTTCCTAGAGACTGTG    (CA)13CT(CA)5                       5                  163-243           58       AF483776
INRA057 (DYS4)  CCTAGCGACTGTCCAAGCG     CACGGGCTGAGAATTCAAAC    (TG)2TA(TG)10                       6                  122-132           58       X71501
INRA124 (DYS6)  GATCTTTGCAACTGGTTTG     AGGACACAGGTCTGAGAATG    (GT)4A(TG)o                         12                 58-67, 126-190    56       X71546
INRA126 (DYS7)  TCTAGAGGATCAAGGATTTGTG  AATCCATGGAAAGATGCACTG   (TG)„                               8                  58-60, 242-248    54       X71553
INRA189         TTTTGTTTCCCGTGCTGAG     GAACCTCGTCTCCTTGTAGCC   (TG)22                              7                  43-44, 148-156    58       X73941
BM861 (DYS8)    TTGAGCCACCTGGAAAGC      CAAGCGGTTGGTTCAGATG     (GT)6C(TG)10                        6                  135-192           56       AF4837

a Represents the size of repeats in the original clone; N indicates the number of non-CA/TG nucleotides.

b Loci were determined by the number of PCR amplified fragments among the individuals studied in this work. The symbol ">" was used when
a smear existed or when there were too many bands to be counted precisely.


Table 2. BTAY haplotype analysis a

No. Animal ID b  Breed      UMN0803  UMN2908  UMN0905  UMN0929  UMN0108  UMN3008  UMN0307  UMN0504  UMN0103  UMN2404  UMN2001  UMN0920  UMN2303  INRA189  Polymorphic rate (%)

1   BT-061       Brangus    1        1        1        2        1        1        2        1        1        1        1        1        0        1        14.3
2   BT-111       Angus      1        1        2        2        1        1        1        1        1        1        1        1        1        1        14.3
3   BT-1684      Crossbred  1        1        1        1        1        2        1        1        1        2        2        1        1        2        28.6
4   BT-1691      Crossbred  2        1        1        1        2        2        2        1        1        2        2        1        2        2        57.1
5   BT-1703      Crossbred  1        1        1        2        2        2        2        2        1        2        1        1        1        2        50
6   BT-1720      Crossbred  1        0        1        1        2        2        2        0        1        2        2        2        1        2        50
7   BT-1754      Crossbred  2        1        1        1        1        1        2        1        1        2        2        1        1        2        35.7
8   BT-1765      Hereford   1        2        2        1        1        1        2        2        1        1        1        1        1        1        28.6
9   BT-1769      Angus      1        2        1        2        2        1        1        1        1        1        1        1        1        1        21.4
10  BT-1759      Crossbred  1        0        2        2        1        1        1        1        2        1        1        1        1        1        14.3
11  BT-1761      Crossbred  1        1        2        2        2        1        1        1        0        1        1        1        0        1        21.4
12  BT-1763      Crossbred  1        2        1        2        2        1        1        2        2        2        1        0        1        1        42.7
13  M4           Simmental  1        1        1        1        2        1        1        1        2        1        1        1        1        1        14.3
14  M22          Simmental  1        1        1        2        2        1        1        2        2        1        1        1        1        1        28.6
15  M44          Simmental  1        1        1        2        1        1        1        1        1        1        2        1        1        1        14.3
16  M60          Simmental  1        1        2        2        2        1        1        2        1        1        1        1        1        1        28.6
17  M72          Simmental  1        1        1        0        2        1        1        1        2        1        1        1        1        1        14.3
18  F2410        Holstein   1        1        1        2        3        0        0        3        0        3        0        0        3        0

Polymorphic rate (%)        12       18       29       59       59       24       35       29       29       35       29       5.9      5.9      29

a "0" represents no PCR products; "1" the most common loci among animals genotyped; "2" loci that are different from the most
common loci; "3" sizes of PCR products amplified from female DNA that differ from those amplified from male DNA.

b Animals #1-17 are males, and #18 is a female.
Fig. 1. Genotyping results from eight BTAY microsatellites. Eighteen animals (♂ 1-17, ♀ 18) are listed at the top of each
column. Three PAR markers (UMN0803, UMN0905 and UMN0929) displayed the same size of PCR products in both male and
female animals. The remaining five microsatellites are BTAY specific, with UMN2001, UMN2303 and UMN2404 amplifying
either very weak or different size bands in female DNA. UMN0307 demonstrates the amplification of two groups of PCR products
from the same marker. UMN2303 shows smear-like bands and UMN2404 the typical ladder-like bands. UMN2611 is uninformative
among the animals tested, and is presented here to demonstrate the quality of the DNA samples. Band sizes are indicated at
the right side for each autoradiogram.


multiple bands of different sizes for each individual (male)
sample resulting from variation in the size of the MS repeat at
each locus. The latter provides proof that the gene in question is
a multicopy gene. Hence, for the BTAY-specific region there is
variation in size of amplification products (each product
representing a locus) within an individual and allelic variation at each
locus across individuals.

Genotyping results (Tables 1 and 2) showed that the number
of loci found for a given MS within an individual varied from 2 to
over 20 (Table 1, Fig. 1), with the exception of UMN0705,
which produced a smear that prevented us from identifying
specific loci or counting them. In addition to UMN0705, six
microsatellites (UMN2303, UMN0304, UMN0920, UMN2404,
UMN2706 and UMN2713) presented numerous ladder-like
bands, so their number of loci could not be counted precisely.
A typical ladder-like (UMN2404) or smear-like band
(UMN2303) is shown in Fig. 1. The MS locus length variance
within individual samples ranged from 2 bp (UMN2001,
Fig. 1) to 42 bp (UMN2706), corresponding to the minimum
and maximum number of loci observed, respectively. Several
microsatellites (UMN0108, UMN0504, UMN1201 and
UMN0307, Fig. 1) amplified two groups of PCR products per
individual sample, so their locus length variance within an
individual could be larger, from 38 bp (UMN0504) to 250 bp
(UMN1201). By DNA sequence analysis (data not shown), we
have found that the UMN0705 sequence is related to the
bovine TSPY gene family, which has been reported to be spread
over the entire BTAY (Matthews and Reed, 1992; Vogel et al.,
1997a, 1997b). Based on these observations, we conclude that
the smear or smear-like bands of PCR products obtained from
a given MS, as mentioned above, are most likely generated by the
large number of multicopy hemizygous loci existing along the Y
chromosome of each of the sampled bulls and are not the result of
unspecific PCR amplification.

Unlike common MS, which are usually single copy, many of
the Y-specific MS have been found to be multicopy (Liu et
al., 2002). There are several ways to determine the copy
number of a locus in a genome. The easiest way we found is by RH
typing. Usually, a multicopy BTAY MS showed a retention
frequency of over 55% in the 7,000-rad cattle-hamster WG-RH
panel (Liu et al., 2002). However, a high retention frequency
does not always mean a high locus copy number. For example,
UMN2611 and UMN2404 have retention frequencies of 64.8
and 56.3% (Liu et al., 2002), with locus copy numbers of 3 and
15, respectively (Table 1 and Fig. 1).

BTAY microsatellite polymorphisms

Out of 38 MS genotyped, 14 were found to be polymorphic
across individuals (Table 2). The remaining 24 were
uninformative among the animals tested. As shown in Table 2,
different markers have different levels of polymorphism. The most
informative MS was UMN0929, with four different genotypes
found in the 17 bulls (Fig. 1). When comparing the MS between
the PAR and the Y-specific region, we found that all PAR MS
except UMN2008 were polymorphic (5/6), while less than
one-third of the 32 Y-specific MS were polymorphic (9/32). Among
the 14 polymorphic MS, the five PAR MS (UMN0803, UMN2908,
UMN0905, UMN0929 and UMN0108) were more polymorphic
on average (35.3%) than the nine Y-specific MS (19.6%).
These results seem reasonable because the PAR on BTAY
and BTAX recombine during meiosis, and the region is
therefore prone to generate more polymorphisms. In humans, it has
been found that the male recombination frequency of MS typed
in PAR1 (the main portion of the human PAR) is 10-20
fold higher than the female's (Weissenbach et al., 1987; Wolf et
al., 1992). It remains to be investigated whether the bovine
PAR loci have a higher polymorphic rate than marker
loci on autosomes.
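
The polymorphic rates listed in Table 2 can be reproduced with a few lines of arithmetic. The sketch below assumes that a rate is simply the share of entries coded "2", with failed amplifications (code "0") kept in the denominator; this convention is inferred from the table values rather than stated in the text, and the function name `polymorphic_rate` is introduced here for illustration.

```python
# Reproduce polymorphic rates from the codes in Table 2.
# Assumption: rate = share of entries coded 2; failed PCRs (0) stay in the denominator.

def polymorphic_rate(codes):
    """Percentage of entries coded 2 (different from the most common pattern)."""
    return 100 * sum(1 for c in codes if c == 2) / len(codes)


# Codes for two bulls across the 14 polymorphic markers, copied from Table 2.
bt_061 = [1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 0, 1]
bt_1691 = [2, 1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 1, 2, 2]

# Number of bulls (out of 17) coded 2 at each PAR marker, counted from Table 2.
par_counts = {"UMN0803": 2, "UMN2908": 3, "UMN0905": 5, "UMN0929": 10, "UMN0108": 10}
n_bulls = 17
par_mean = 100 * sum(par_counts.values()) / (n_bulls * len(par_counts))

print(round(polymorphic_rate(bt_061), 1))   # 14.3
print(round(polymorphic_rate(bt_1691), 1))  # 57.1
print(round(par_mean, 1))                   # 35.3
```

The first two values match the per-animal rates given for BT-061 and BT-1691, and the last matches the 35.3% average quoted for the five PAR markers.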

Besides the six PAR MS, 21 BTAY-specific MS amplified
PCR products from the Holstein female DNA sample
(Table 2). In most cases, PCR products obtained from the female
DNA sample differed from male amplification products in
size and amount of product as determined by ethidium
bromide band brightness. Three MS (UMN1605, UMN2405 and
INRA126), also reported to be Y chromosome-specific,
amplified PCR products of exactly the same size from both male and
female DNA. It has also been reported that the INRA126
marker identified PCR products in the same size range in six female
and six male yaks (Edwards et al., 2000). There are at least three
possible explanations as to why a Y-specific marker amplifies
the same product from both male and female DNA. One
explanation is the common chromosome-pair ancestry that gave origin
to the X and Y chromosomes. As demonstrated by
human sex chromosome mapping, the X-specific and
Y-specific regions still share homologous sequences in several areas
even though these regions do not pair and recombine at meiosis
(Wolf et al., 1992; Skaletsky et al., 2003). A second explanation
can be based on the fact that BTAY-specific markers have
multiple copies spread along the chromosome. An example is the
TSPY related sequence family (BRY) that has approximately
1,200 copies dispersed all over the Y chromosome and about
100 copies on the X chromosome (Matthews and Reed, 1992).
RH mapping has shown that UMN1605 and UMN2405 had
extremely high retention frequencies of 74.4 and 71.9%,
respectively. For these latter two markers we have not been able to
rule out that a few copies might exist in the PAR or another region
of the X chromosome, leading to amplification of similar PCR
products from males and females. The third possible explanation
is that BTAY-specific MS primers produce unspecific PCR
amplification in female DNA samples. As mentioned above,
faint bands or different size PCR products obtained from
female DNA samples are probably unspecific amplification
products. Although PCR conditions were optimized in this
study, we cannot completely exclude the possibility of
some unspecific amplification, especially when working with Y
chromosome repetitive sequences and/or multicopy markers.
In our experience, optimization of the PCR conditions, the
annealing temperature in particular, is crucial to obtain a stable
amplification pattern for genotyping.

Four Y-specific microsatellites (INRA124, INRA126,
INRA189 and BM861), used in this study as positive controls,
had already shown polymorphisms in different bovid species
including domestic cattle, bison, mithan, swamp buffalo and
yak (Hanotte et al., 1997; Edwards et al., 2000). However, only
one (INRA189) was found to be polymorphic in our test
samples. Our results indicate that the animals used in this study
were from closely related cattle breeds (or crosses), providing a
limited picture of Y chromosome polymorphism in the
entire cattle population. Our results also lead us to believe that
the remaining 24 MS found uninformative in this study may
prove to be polymorphic in a larger sample of cattle, in other
cattle breeds, and in other bovid species.

Several microsatellites failed to amplify any PCR products
in a few DNA samples, as can be seen in Fig. 1 for UMN2303
(samples #1 and #11), and in Table 2 for UMN2908 (#6 and
#10), UMN0504 (#6), UMN0103 (#11) and UMN0920 (#12).
These failures were confirmed in several PCR amplification
trials. Since the majority of the typed MS did amplify from all
test samples, the quality of the DNA samples should not be
considered as a possible explanation for PCR failure. To find
out if lack of amplification was due to Y chromosome segment
deletions, DNA samples were dot blotted and hybridized
with radiolabeled PCR amplified DNA fragments from MS
UMN0905, UMN0929 and UMN2905. Dot blot results (data
not shown) were positive, indicating that lack of PCR
amplification was not due to Y chromosome segment deletions for the
three MS studied. A possible explanation for PCR
amplification failure might be a sequence mutation at the priming site,
which needs to be investigated in the future.

Y chromosome haplotypes 

The 14 polymorphic MS were combined to produce
haplotypes as listed in Table 2. Results indicate that each individual
sample represented a unique haplotype. Although the test
sample is too small for a genetic cluster analysis, we could still
identify haplotype patterns among the animals tested. For
example, the five crossbred animals (BT-1684, BT-1691,
BT-1703, BT-1720, and BT-1754) had similar haplotypes (also see
UMN2404 in Fig. 1), while the haplotypes of two
Simmental bulls (M4 and M72) were very close compared to other
animals (Table 2). Clearly, the polymorphic markers identified
in this work and their related haplotypes should provide a
powerful tool to study the origin and evolution of domestic cattle as
well as bovid species.
Abstract. PCR protocols incorporating fluorescently
labeled multiplexed primer combinations were developed to
produce a linkage map for bison. Three hundred fifty-eight
microsatellite loci spanning all 29 autosomes were genotyped via
83 PCR multiplexes and nine individual amplifications. A
total of 292 markers were integrated into an autosomal linkage
map for bison. The sex averaged bison map (2,647 cM) was
approximately 9% longer than the corresponding USDA MARC
map, which covered 2,415 cM. Utilizing weaning, yearling and
17-month weights from two private bison herds, a QTL scan
was conducted using the developed linkage map. LOD peaks
suggestive of QTL were identified on chromosomes 2, 7, 15,
and 24 for weaning weight, chromosomes 4, 14, and 15 for
yearling weight and chromosomes 8, 14, and 25 for 17-month
weight. Four of the identified chromosomes have conserved
synteny with regions harboring growth QTL in cattle.

Copyright©2003 S. Karger AG, Basel 


Bison once numbered in the tens of millions in North
America. However, due to the population bottleneck experienced in
the late 1800s, their numbers were reduced to not more than
541 individuals by 1888 (Coder, 1975; Dary, 1989). Almost all
of the bison alive today can be traced back to five populations
that were used to repopulate most of the extant public and
private herds (Coder, 1975). Current semi-wild bison populations
are fragmented among public parks and sanctuaries throughout
the United States and Canada. However, more than 90% of the
bison today reside on private ranches where they are raised for
meat production. Reduced reproduction and other deleterious
manifestations associated with a bottlenecked population are
rare in current bison herds (Berger and Cunningham, 1994).
The fact that bison went through a severe population
bottleneck without suffering catastrophic consequences and are
increasingly becoming a viable participant in the livestock
industry makes them a unique species for genetic study and
comparison to domestic cattle.

The second-generation bovine linkage map of Kappes et al.
(1997) contained 1,250 markers with an average marker
interval of 2.5 cM. Additional markers have subsequently been
added to this map, further increasing marker density and providing
a rich resource for genetic studies of bison. However, to date,
only three studies report microsatellite variation in North
American bison, two of them on a very limited scale
(Mommens et al., 1998; Wilson and Strobeck, 1999; Ward, 2000).
Genetic evaluations, whether for the purposes of genome scans
for economically important genes or conservation biology,
require genotyping large numbers of markers across a large
number of individuals. Genotyping microsatellites by
radioactive methods can be both time-consuming and costly. While
microsatellite markers offer the advantage of rapid throughput
via co-amplification and simultaneous genotyping of multiple
loci, this feature has rarely been utilized because the
optimization of multiplex reactions is often more time-consuming and
difficult than the optimization of PCR for individual loci.
However, using the information amassed during construction
of the bovine genetic map and fluorescence-based detection, it
is possible to greatly reduce the time and cost involved in
genotyping. Unfortunately, multiplexing on a sufficient scale has
not previously been done in cattle or bison for linkage mapping
or quantitative trait loci (QTL) scans. The availability of a
panel of multiplexed microsatellite markers for rapid
genotyping of North American bison and domestic cattle would greatly
enhance the feasibility and efficiency of initial QTL scans and
of genetic diversity assessments in these species.

Numerous QTL regions have been identified in cattle for
traits such as weight and carcass characteristics (Elo et al.,
1999; Stone et al., 1999; Grosz and MacNeil, 2001; Kim et al.,
2003). Most of these reports utilized experimental crosses to
produce a mapping population by crossing lines with large
differences in mean phenotype. Forming experimental crosses
between divergent lines offers the advantages of an increased
likelihood of segregating QTL, increased power to detect QTL and
ease of analysis, but it also has a very important limitation.
These crosses are designed to detect QTL which are fixed, or
nearly fixed, for alternate alleles between the parental breeds.
Thus, the identified QTL are unlikely to be segregating in the
founder populations that were crossed (Haley, 1999).
Conversely, mapping within commercial populations offers the
advantage that specific crosses need not be made and thus pedigrees
and phenotypes can quickly be collected. Additionally, any
QTL identified within a commercial population may
immediately be incorporated in the breeding program for that
population. However, these advantages come at the cost of
decreased power of QTL detection and an increase in the
computational complexity required to appropriately analyze the
phenotypes.

Materials and methods 

DNA source 

Two private bison herds were used as the source of animals for the
mapping and QTL analysis. The Arrowhead Buffalo Ranch herd (ABR) located
in Canton, Ohio contained 173 total individuals, 49 parents (43 females, 6
bulls) and 124 offspring (57 females, 67 males). The Hidden Hollow Preserve
herd (HHP) located in Bradfordsville, Kentucky contained 40 total
individuals, 12 parents (11 females, 1 bull) and 28 offspring (15 females, 13 males).
The herds were not known to be related. Genomic DNA was isolated from
tail hair follicles or white blood cells by proteinase K treatment followed by
phenol:chloroform extraction (Sambrook et al., 1989) or by using the SUPER
QUICK-GENE DNA Isolation kit (Analytical Genetic Testing Center,
Denver, Colorado).

Microsatellite loci 

A total of 358 microsatellite primers were synthesized with a fluorescent
label added to each forward primer (Supplementary Appendix A;
www.karger.com/doi/10.1159/000075726). Loci were chosen primarily from the
USDA MARC cattle mapping database (http://www.marc.usda.gov), but
also included other published marker reports. Attempts were made to evenly
space the markers on each chromosome and to choose loci with high
heterozygosity and a large number of alleles based on cattle data. Additionally,
markers were chosen at higher density for chromosomal regions previously
reported to harbor growth QTL in cattle. Multiplexes were developed by
combining loci with different fluorescent labels and different allele size
distributions in domestic cattle. PCR conditions were optimized in an attempt
to maximize the number of loci included within a single reaction. Multiplex
PCR conditions and thermal profiles can be found in Supplementary
Appendices B and C (www.karger.com/doi/10.1159/000075726). All PCR was
carried out using a GeneAmp PCR 9700 thermocycler (PE Biosystems). PCR
products were separated on an ABI Prism 377 DNA Sequencer or an ABI
Prism 310 Genetic Analyzer (PE Biosystems) and were sized relative to an
internal size standard (MAPMARKER LOW, Bioventures).

Mapping 

A linkage map for each bison autosome was constructed using CRI-MAP
v. 2.4 (Green et al., 1990) and the BUILD option by incorporating loci into
the map whose order was supported by a LOD score >3. Any remaining
markers were subsequently incorporated into the map in the order of
decreasing number of informative meioses using the ALL option. In many
instances, the likelihood threshold had to be reduced to a LOD score of 2 in
order to place these markers into the map due to a low number of informative
meioses. The FLIPS option was used to evaluate local permutations of
marker order. Finally, the CHROMPIC option was used to identify spurious
double recombinants and to facilitate the correction of genotyping errors.
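
The LOD thresholds mentioned above are easier to interpret as odds. A LOD score is the base-10 logarithm of a likelihood ratio, so the short sketch below converts between the two; the function names `lod` and `odds` are introduced here for illustration and are not CRI-MAP options.

```python
# LOD score = log10 of the ratio between two likelihoods (e.g. two marker orders).
import math


def lod(likelihood_best: float, likelihood_alternative: float) -> float:
    """LOD support for the best hypothesis over an alternative."""
    return math.log10(likelihood_best / likelihood_alternative)


def odds(lod_score: float) -> float:
    """Likelihood ratio corresponding to a LOD score."""
    return 10 ** lod_score


print(odds(3))  # 1000: the LOD > 3 criterion used with BUILD
print(odds(2))  # 100: the relaxed criterion used for the remaining markers
```

Lowering the threshold from 3 to 2 therefore accepts marker placements supported by odds of 100:1 rather than 1000:1.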

QTL analysis 

Weights were collected from each of the bison herds over a 4-year period
(1997-2000) using electronic scales. Weights were collected at approximately
6 (W6), 12 (W12) and 17 (W17) months of age, corresponding to weaning,
yearling and just prior to feedlot finishing. A total of 127 offspring
(1998-2000) from both herds were used for weaning weight and 77 offspring for
W17. Because HHP did not collect yearling weights, 97 offspring from the
ABR herd (1997-2000) were used for the yearling weight analyses. For all
QTL analyses, phenotypes were adjusted for age and gender.

The program LOKI v2.4.5 (Heath, 1997) was used to generate exact
estimates of multipoint identity by descent (MIBD) values via MCMC analysis.
Using the sex averaged linkage map generated from CRI-MAP, MIBD values
were estimated at 1 cM intervals along each chromosome. An initial burn-in
of 1,000 iterations was followed by 500,000 iterations where estimates were
collected every 5th iterate.
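
The sampling scheme just described implies a fixed number of stored estimates. The sketch below only illustrates that bookkeeping; it assumes that the burn-in iterations are discarded and that every 5th iteration after burn-in is kept, which is one common convention and not necessarily how LOKI numbers its iterations internally. The function names are hypothetical.

```python
# Bookkeeping for a thinned MCMC run (illustrative; not LOKI's own code).

def retained_samples(iterations: int, thin: int) -> int:
    """Number of estimates kept when every `thin`-th iteration is stored."""
    return iterations // thin


def kept_iterates(burn_in: int, iterations: int, thin: int) -> range:
    """1-based indices of the stored iterations, counting burn-in first."""
    return range(burn_in + thin, burn_in + iterations + 1, thin)


print(retained_samples(500_000, 5))           # 100000 stored MIBD estimates
print(kept_iterates(1_000, 500_000, 5)[:3])   # range(1005, 1020, 5)
```

Under these assumptions the MIBD run stores 100,000 estimates per position.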

Multipoint QTL analysis was also performed using LOKI; however,
2 million iterations were run after an initial burn-in of 10,000 iterates. A
detailed description of the model and MCMC sampling process is given
in Heath (1997). Briefly, the trait is modeled by k biallelic QTL where, for
QTL i, genotypes A1A1, A1A2 and A2A2 have effects $a_i$, $d_i$ and $-a_i$,
respectively. The model for trait y can be expressed as:

$$y = \mu + X\beta + \sum_{i=1}^{k} Q_i \alpha_i + Zu + e$$

where $\mu$ is the overall trait mean, $\beta$ is an (m × 1) vector of fixed effects and
covariates, $\alpha_i$ is a (2 × 1) vector of allele substitution effects for the i-th QTL, $u$
is an (n × 1; n animals each with a single observation) vector of random
normally distributed additive residual polygene effects, $e$ is an (n × 1) vector
of normally distributed residuals, k is the number of QTL in the model, and
$X$ (n × m), $Q_i$ (n × 2) and $Z$ (n × n) are incidence matrices for the fixed, QTL
and polygenic effects, respectively. LOKI offers the analytical advantage of
allowing the number of QTL in the model to vary while simultaneously
analyzing the entire genome.
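
To make the structure of this model concrete, the sketch below builds a single phenotype from its components for one animal. It follows the genotype-effect description in the text (effects a, d and -a for the three genotypes of a biallelic QTL) rather than the matrix form, and it is not LOKI's implementation: the function names, the effect sizes and the standard deviations are all hypothetical values chosen for illustration.

```python
# One animal's phenotype as the sum of mean, fixed effects, QTL effects,
# polygenic effect and residual. All numbers are made up for illustration.
import random


def genotypic_value(n_a1: int, a: float, d: float) -> float:
    """Effect of one biallelic QTL given the number of A1 alleles (2, 1 or 0)."""
    return {2: a, 1: d, 0: -a}[n_a1]


def phenotype(mu, fixed, qtl_genotypes, qtl_effects, polygenic, residual):
    """y = mu + fixed effects + sum of QTL effects + polygenic effect + residual."""
    qtl_sum = sum(
        genotypic_value(g, a, d) for g, (a, d) in zip(qtl_genotypes, qtl_effects)
    )
    return mu + fixed + qtl_sum + polygenic + residual


rng = random.Random(1)                  # seeded so the sketch is repeatable
effects = [(10.0, 2.0), (4.0, 0.0)]     # hypothetical (a, d) pairs for k = 2 QTL
genotypes = [2, 1]                      # A1A1 at the first QTL, A1A2 at the second
y = phenotype(
    mu=200.0,
    fixed=-5.0,                         # e.g. a combined age and gender adjustment
    qtl_genotypes=genotypes,
    qtl_effects=effects,
    polygenic=rng.gauss(0, 3),
    residual=rng.gauss(0, 5),
)
print(round(y, 1))
```

In the actual analysis the number of QTL, their positions and their effects are unknown and are sampled by the MCMC procedure; the sketch only shows how the terms combine once they are fixed.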

Variance component interval mapping was also performed using the
program SOLAR v2.0.4 (Almasy and Blangero, 1998) according to the
documentation accompanying the software. Multipoint interval analysis using the
MIBD estimates obtained from LOKI was conducted at 1 cM intervals using
the MULTIPOINT command.


Results 

Mapping 

A total of 358 microsatellite loci were genotyped. In order to 
decrease costs and the time required for genotyping, 83 PCR 
multiplexes were developed containing 349 loci for an average 
of 4.2 loci per PCR. Nine loci were individually amplified due 
to unusual annealing temperatures or because primers were not 
compatible for multiplexing. Because some of the multiplexes 
and all of the individually amplified loci could be co-loaded, it 
was possible to genotype all 358 loci using only 72 lanes per 
animal for an average of 5.0 loci per lane. 
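
The throughput figures in this paragraph follow directly from the counts given; a short check is shown below. The variable names are introduced here for illustration.

```python
# Check the genotyping throughput figures quoted in the text.
multiplex_loci = 349   # loci amplified in multiplex reactions
multiplexes = 83       # number of multiplex PCRs
single_loci = 9        # loci amplified individually
lanes = 72             # gel lanes needed per animal

total_loci = multiplex_loci + single_loci
loci_per_pcr = multiplex_loci / multiplexes
loci_per_lane = total_loci / lanes

print(total_loci)               # 358
print(round(loci_per_pcr, 1))   # 4.2
print(round(loci_per_lane, 1))  # 5.0
```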
Table 1. Number of loci mapped to bison chromosomes along with the
size (Kosambi cM) of the sex specific and sex averaged maps. Sex averaged
values for the corresponding cattle chromosomes are provided for
comparison (http://www.marc.usda.gov).

BBI    Number loci  Cattle sex average (cM)  Bison sex average (cM)  Bison male (cM)  Bison female (cM)

1      28           124.2                    136                     125.4            147.6
2      22           120.4                    142.4                   130.8            162.3
3      12           103.5                    106.6                   98.2             117.4
4      14           88.8                     101.1                   79.2             118.1
5      13           118.3                    133.9                   121.3            142.4
6      14           104.7                    116.5                   104.1            131.8
7      9            124.4                    134.1                   121.5            162.2
8      10           112.2                    106.6                   87.3             121
9      9            101.9                    92.3                    84.1             100.3
10     11           79.4                     104.8                   107.3            100.6
11     11           96.9                     90.8                    83.4             101.2
12     8            105.8                    109                     105.1            114.3
13     8            65.8                     72.7                    53               81.5
14     9            74.5                     79.4                    69.9             89.3
15     8            81.3                     95.2                    86.5             97
16     7            71.7                     87.2                    93.7             81.8
17     6            85.8                     62.7                    60.3             68.3
18     6            76.1                     83.9                    77.3             110.5
19     10           99.5                     107.7                   97.6             118.9
20     8            52.6                     78.9                    72.1             87.7
21     6            56.3                     65.4                    66.1             73.2
22     7            81.1                     86.7                    78.1             98.8
23     12           58                       66.9                    63.1             75
24     12           56.5                     82.3                    82.9             89.2
25     5            54                       54.6                    39.3             73.2
26     5            57.1                     65.3                    55.8             89.6
27     6            64.1                     75.4                    68.8             86
28     6            39.6                     46.2                    39.8             51.5
29     11           60.7                     62.6                    45.9             85.3
Total  292          2415                     2647                    2398             2976


Only one locus (ILSTS065) failed to produce a PCR product
despite numerous attempts to optimize PCR conditions.
Additionally, the only marker for which null alleles were detected
was RM404, located on chromosome 25. Most of the remaining
markers produced alleles near the expected size ranges in cattle,
although several loci had allele size distributions that differed
substantially from cattle. The average number of alleles per
locus was 4.7 and the average observed heterozygosity per locus
was 50.2%.

The ability to place markers in a linkage map is ultimately
dependent on the number of informative meioses and the
genetic distance between flanking loci. Although the bison
herds used in this study were numerically small and were not
designed for mapping, a total of 292 markers were integrated
into an autosomal linkage map for bison. Because of difficulty
inferring phase for some markers, only 65% of these loci could
be ordered with LOD support >3. For these markers, marker
order agreed with published marker order for cattle, thus
providing preliminary evidence for conservation of synteny across
the autosomes. Of the 66 loci which could not be mapped, 27
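
Both summary statistics reported here (number of alleles and observed heterozygosity per locus) are simple counts over the genotypes at a locus. The sketch below shows the calculation for one locus; the genotypes are made-up allele sizes used only for illustration, and the function name `locus_summary` is hypothetical.

```python
# Number of alleles and observed heterozygosity for one microsatellite locus.

def locus_summary(genotypes):
    """genotypes: list of (allele1, allele2) tuples, alleles given as sizes in bp.

    Returns (number of distinct alleles, observed heterozygosity).
    """
    alleles = {allele for pair in genotypes for allele in pair}
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return len(alleles), heterozygotes / len(genotypes)


# Hypothetical genotypes for four animals at one locus.
example = [(120, 124), (120, 120), (124, 128), (120, 128)]
n_alleles, h_obs = locus_summary(example)
print(n_alleles, h_obs)  # 3 0.75
```

Averaging these two values over all scored loci would give figures of the kind quoted above (4.7 alleles and 50.2% heterozygosity per locus).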
were monomorphic, one would not amplify and 38 were biallel- 
ic and could not be mapped because in the majority of segregat¬ 
ing families all individuals were heterozygous making it impos¬ 
sible to infer phase. The sex averaged bison map (2,647 Kosam- 


Table 2. Most likely position, LOD and Bayes Factor statistical support
and proportion of variance explained for weight QTL identified by both
SOLAR and LOKI.

Trait  BBI  SOLAR position  SOLAR LOD  LOKI position  LOKI BF  VgQTL a  VtQTL b

W6     2    3               1.44       1              6.69     0.19     0.17
       7    56              1.44       32             5.05     0.22     0.20
       15   110             1.01       107            3.08     0.17     0.15
       24   82              1.11       67             19.50    0.33     0.30
W12    4    14              1.48       17             7.94     0.30     0.26
       14   27              1.32       32             5.73     0.31     0.23
       15   56              1.26       65             3.81     0.27     0.23
W17    8    22              1.52       37             3.25     0.16     0.16
       14   63              3.27       62             31.2     0.40     0.38
       25   33              1.23       45             3.93     0.25     0.24

a Proportion of the genetic variance due to the QTL.
b Proportion of the total variance due to the QTL.


bi cM) was approximately 9 % longer than the corresponding 
USDA MARC map, which covered 2,415 cM (Table 1). This 
small degree of map inflation was expected due to the low num¬ 
bers of informative meioses and the relatively small number of 
scored loci. The female map (2,976 cM) was 11 % longer than 
the male map (2,398 cM). Figures representing both the sex 
average and sex specific bison linkage maps are available from 
the authors upon request. 
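
The map lengths above are given in Kosambi centimorgans, while the marker positions in Fig. 1 are given in Haldane centimorgans. As a minimal sketch of the difference between the two units (the function names are illustrative and are not taken from the software used in this study), the two standard mapping functions convert a recombination fraction r into a map distance as follows:

```python
import math

def haldane_cm(r):
    """Haldane map distance in cM for a recombination fraction 0 <= r < 0.5.

    Assumes no crossover interference.
    """
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5.

    Allows for partial crossover interference.
    """
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

if __name__ == "__main__":
    # Illustrative recombination fractions, not values from the bison map.
    for r in (0.01, 0.10, 0.25):
        print(r, round(haldane_cm(r), 2), round(kosambi_cm(r), 2))
```

For the same recombination fraction the Kosambi distance is never larger than the Haldane distance, so map lengths should only be compared when both are expressed with the same mapping function.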

QTL scan 

SOLAR and LOKI generally yielded concordant results and 
indicated 10 putative QTLs on eight chromosomes (Table 2). 
The only chromosome to achieve a LOD score >3 was BBI14 
for W17. Test statistic profiles for BBI14 and 15 which showed 
evidence for two QTL influencing two different weight traits 
are presented in Fig. 1. Test statistic profiles for the remaining 
chromosomes can be found in Supplementary Fig. 1
(www.karger.com/doi/10.1159/000075726).

Discussion 

Markers 

The development of this panel of multiplexed markers will 
be of benefit for genetic diversity and QTL mapping studies of 
domestic cattle and their wild relatives. Given the close rela¬ 
tionship between bison and cattle and the results of Ward 
(2000), it was expected that virtually all of the cattle microsatellite
loci would amplify in bison; however, the degree of variation
and thus the informativeness of these markers in bison was
unknown. Of the 358 genotyped loci only one marker consis¬ 
tently failed to amplify while 27 (7.5%) were monomorphic in 
these two bison herds. Since the animals comprising the ABR 
and HHP herds were acquired from various regions of the 
country they should contain the majority of the genetic varia¬ 
tion found in plains bison (data not shown). However, it is like¬ 
ly that for some loci there are additional alleles present in bison 
that were not sampled in these herds. 
Fig. 1. Interval analyses of BBI14 and 15 using SOLAR and LOKI. All SOLAR results are relative to the left axis which is a
LOD score. All LOKI results are relative to the right axis which is a Bayes Factor. Marker locations (Haldane cM) are indicated by
triangles.

The average number of alleles in bison was 4.7 which is
approximately half (9.3) the mean number of alleles for these
same markers found in the USDA MARC mapping population
which represents different cattle breeds from both B. taurus
and B. indicus. However, bison compare favorably to the mean
number of alleles of 4.5 for 20 microsatellites examined in 728
cattle from 20 populations as reported by MacHugh (1996).
Thus, there is approximately the same level of allelic diversity
when bison are compared to individual cattle breeds or populations
but bison have half the number of alleles as compared to
cattle in total.

The mean observed heterozygosity for the ABR and HHP
herds was 50.2% (54.5% if monomorphic loci are excluded).
This value is comparable to cattle in which Kappes et al. (1997)
reported heterozygosities of 40.2% for linebred B. taurus,
42.5% for purebred B. taurus, and 59% for F1 B. taurus. In
addition, Grosz and MacNeil (2001) reported 57% heterozygosity
for CGC dams and an F1 bull and MacHugh (1996)
reported a mean heterozygosity of 55.1% in a sample representing
20 populations from seven breeds. While bison may
have lost some allelic diversity due to the population bottleneck
of the latter part of the 1800s, they appear to have maintained
at least as much heterozygosity as has been retained in the
domestication and selective breeding of cattle. This is probably due
to the short duration of the bottleneck and the manner in which
the foundation herds were started with animals from diverse
geographic areas which were subsequently used to populate
private herds. The relatively high levels of heterozygosity reflect
only a limited degree of inbreeding and explain why bison have
not experienced many of the complications associated with a
severe bottleneck.
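
The two heterozygosity figures quoted above differ only in whether monomorphic loci enter the average. The following is a minimal sketch of that calculation; the function names and the toy genotypes are hypothetical and are not data from the ABR or HHP herds.

```python
def observed_heterozygosity(genotypes):
    """Fraction of heterozygous individuals at one locus.

    genotypes: list of (allele1, allele2) tuples, one per animal.
    """
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)

def mean_heterozygosity(loci, exclude_monomorphic=False):
    """Average observed heterozygosity across loci.

    loci: dict mapping a locus name to its list of genotypes.
    A locus is treated as monomorphic when only one allele is observed.
    """
    values = []
    for genotypes in loci.values():
        alleles = {allele for pair in genotypes for allele in pair}
        if exclude_monomorphic and len(alleles) == 1:
            continue
        values.append(observed_heterozygosity(genotypes))
    return sum(values) / len(values)

# Hypothetical toy genotypes (allele sizes in bp), for illustration only.
toy_loci = {
    "locusA": [(120, 124), (120, 120), (124, 128), (120, 128)],
    "locusB": [(200, 200), (200, 200), (200, 200), (200, 200)],  # monomorphic
}

if __name__ == "__main__":
    print(mean_heterozygosity(toy_loci))
    print(mean_heterozygosity(toy_loci, exclude_monomorphic=True))
```

Dropping monomorphic loci can only raise the mean, which is the direction of the difference between the two values reported for the bison herds.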

Mapping 

Since the majority of the markers employed in this study 
have never been tested in bison, there was no prior information 
concerning their informativeness except for the results from 
cattle. Even though bison had half the number of alleles as cat¬ 
tle and the family size was far smaller than has previously been 
used in cattle mapping projects, a total of 292 loci (82%) were 
mapped. The sex averaged bison map (2,647 cM) was approxi¬ 
mately 9% longer than the corresponding USDA MARC map, 
which spanned 2,415 cM (Table 1). Given the difference in the 
number of scored offspring and marker density, this map infla¬ 
tion was expected. As more offspring are genotyped (particu¬ 
larly with phase known meioses) and additional markers are 
added, it is expected that the bison map will shrink and be very
similar in length to the cattle map. 
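
The marker counts reported in this study can be cross-checked with a few lines of arithmetic. This is only a bookkeeping sketch using the numbers stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
genotyped = 358
mapped = 292
unmapped = {
    "monomorphic": 27,
    "failed to amplify": 1,
    "biallelic, phase could not be inferred": 38,
}

# Mapped and unmapped loci should account for every genotyped locus.
total_unmapped = sum(unmapped.values())
accounted_for = mapped + total_unmapped

pct_mapped = 100.0 * mapped / genotyped
pct_monomorphic = 100.0 * unmapped["monomorphic"] / genotyped

if __name__ == "__main__":
    print(total_unmapped, accounted_for)
    print(round(pct_mapped), round(pct_monomorphic, 1))
```

The totals agree with the 66 unmapped loci, the 82% mapped and the 7.5% monomorphic figures given in the text.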

QTL Scan 

The QTL scan identified eight chromosomes with similar 
statistical support for QTL using two different mapping ap¬ 
proaches. Each of these chromosomes (except BBI14 for W17) 
failed to reach statistical support even for “suggestive” linkage 
as determined by Lander and Kruglyak (1995). However, the 
estimates of QTL location were generally in agreement between 
the two mapping approaches. In Table 2, the maximum LOD
scores for these chromosomes are in the range of 1.0-1.5, which 
was approximately four times the average of the maximum 
LOD scores from the chromosomes without evidence for QTL 
(data not shown). We believe that the magnitude of the maxi¬ 
mum LOD scores is due to the small family sizes rather than a 
lack of QTL segregating in bison. 
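
To give a feel for what LOD scores of this size mean, a LOD score can be converted to an approximate pointwise p-value. The sketch below assumes the usual asymptotic result for a variance component linkage test, in which 2 ln(10) x LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom. This is an assumption of the sketch, not a description of how SOLAR or LOKI computed the results reported here, and it gives pointwise rather than genome-wide significance.

```python
import math

def lod_to_pointwise_p(lod):
    """Approximate pointwise p-value for a LOD score.

    Assumes the likelihood ratio statistic 2*ln(10)*LOD is distributed as a
    50:50 mixture of 0 and chi-square with 1 degree of freedom.
    """
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 degree of freedom, P(X >= x) = erfc(sqrt(x / 2)).
    upper_tail = math.erfc(math.sqrt(chi2 / 2.0))
    return 0.5 * upper_tail

if __name__ == "__main__":
    # LOD values spanning the range reported in Table 2.
    for lod in (1.0, 1.5, 3.27):
        print(lod, lod_to_pointwise_p(lod))
```

Under this assumption a LOD score of 3 corresponds to a pointwise p-value of roughly 1 in 10,000, whereas LOD scores between 1.0 and 1.5 correspond to far weaker evidence.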

Because bison and cattle are so closely related, it is useful to 
compare the results presented here with prior QTL reports 
from cattle. Half of the chromosomes found in this study have 
conserved synteny with regions harboring QTL in cattle. Kim 
et al. (2003) reported QTL on BTA2 affecting yearling, slaught¬ 
er and hot carcass weight in the same region as the QTL on 
BBI2 affecting weaning weight. The most statistically significant
QTL for 17-month weight on BBI14 is in the same region
as a QTL affecting hot carcass weight in cattle (Kim et al., 
2003). Casas et al. (2001) identified a QTL on BTA4 affecting 
hot carcass weight in the region corresponding to that harboring 
the yearling weight QTL on BBI4. Although the QTL on BTA8
identified by Casas et al. (2001) influenced fat depth, this QTL 
is in the same location as the yearling weight QTL identified in 
bison. 

Polziehn et al. (1995) and Ward et al. (1999) identified the 
occurrence of bison in public herds carrying domestic cattle 
mitochondrial DNA, which is indicative of domestic cattle 
introgression during and following the bottleneck. In a study of 
introgressive hybridization between bison and cattle Ward 
(2000) identified 22 microsatellite markers, distributed across 
12 autosomes, with differing allele size distributions between 
bison and cattle that can be used to distinguish bison and cattle 
alleles within these chromosomal regions. For these markers 
Ward (2000) identified domestic cattle alleles in five of the 14 
public North American bison populations indicating nuclear as 
well as mitochondrial introgression. Four of the eight bison 
chromosomes putatively harboring growth QTL (BBI2, 4, 14 
and 24) are included in the 12 chromosomes identified by 
Ward (2000). Most notably, the QTL on BBI14 affecting 17 
month weight is flanked by species indicative markers BMS947 
and BM4513, both of which showed no evidence for the intro¬ 
gression of domestic cattle alleles in the ABR or HHP herds. 
Although no hybrids were identified in these two herds for any 
of the chromosomes putatively harboring QTL, the possibility 
that the identified QTL were transferred laterally to bison from 
cattle cannot be excluded. 

The fact that two species divergent by a million years pos¬ 
sess similar QTL effects provides mutual support for the validi¬ 
ty of these QTL. However, it remains open as to whether these 
QTL represent the same genes and if so, whether there were 
independent mutations in the bison and Bos lineages, whether
the same mutations have persisted following the original diver¬ 
gence or whether the bison QTL actually represent domestic 
cattle introgression. The latter issues might be addressed by 
fine mapping and examination of these regions for the presence 
of cattle haplotypes. If these were detected, bison may provide a 
useful resource for the positional cloning of these QTL based 
upon the minimal common cattle haplotype within hybrid 
bison. 

The QTL identified on BBI7, 15, 24 and 25 do not have 
corresponding published QTL affecting weight reported in cat¬ 
tle. These QTL may not yet have been detected in cattle, may 
represent differences between bison and cattle, or they may be 
false positives. Due to the lack of power inherent to these pedi¬ 
grees, this issue will not be resolved until additional offspring 
are added to the pedigrees or additional pedigrees are exam¬ 
ined. 

Conclusions 

The development of the multiplex PCR protocols for this 
study represents, to our knowledge, the largest set of multi¬ 
plexed microsatellite markers published to date for a livestock 
species. These protocols should serve as the starting point for
other laboratories wishing to optimize these loci based on other 
instrumentation and techniques. The cross-compatibility of 
most of the bovine markers enables markers to be multiplexed 
very easily and therefore should aid in the development of 
additional panels to address specific genome scan or diversity 
applications. The bison linkage map and marker data produced 
in this study should serve as the foundation for future mapping 
and population studies in bison. The population history of
bison, their relationship to cattle and the fact that bison have
only been under artificial selection for approximately 50 years
offer a unique opportunity to study the genomes of two different
but complementary species. The likelihood of positional
cloning of QTL may be significantly enhanced if the same QTL
appears to be segregating in different species.

Abstract. Sixty-four genomic BAC-clones mapping five
type I (ADCYAP1, HRH1, IL3, RBP3B and SRY) and 59 type
II loci, previously FISH-mapped to goat (63 loci) and cattle
(SRY) chromosomes, were fluorescence in situ mapped to river
buffalo R-banded chromosomes, noticeably extending the
physical map of this species. All mapped loci from 26 bovine
syntenic groups were located on homeologous chromosomes
and chromosome regions of river buffalo and goat (cattle)
chromosomes, confirming the high degree of chromosome
homeologies among bovids. Furthermore, an improved cytogenetic
map of the river buffalo with 293 loci from all 31 bovine
syntenic groups is reported.

Copyright©2003 S. Karger AG, Basel 


Genomes of domestic animals are practically unknown
when compared with those of both humans and mice. In cattle,
the most studied species, 4,109 loci are mapped with 1,503
being genes (BovBase,
http://locus.jouy.inra.fr/cgi-bin/lgbc/mapping/bovmap/Bovmap/main.pl,
April 2003), mostly mapped to large chromosome regions.
Indeed, only 642 loci were
assigned to specific chromosome regions or bands (BovBase).
In other bovids, especially in the river buffalo (Bubalus bubalis,
2n = 50, BBU), physical maps are relatively poor. In fact, only
54 (Iannuzzi, 1998) and 99 (El Nahas et al., 2001) loci were
reported in the previous two maps of this economically important
species. Indeed, more than 130 million river buffaloes
are raised worldwide for both meat and milk production.
Autosome band comparisons between river buffalo and other
bovids (cattle, in particular) have revealed a high degree of
banding homeologies and that the five river buffalo biarmed
pairs originated from five centric fusion translocations involving
ten cattle (bovid ancestor) chromosomes (CSKBB, 1994).
These events were accompanied by loss of constitutive
heterochromatin (HC) when comparing the C-banding patterns of
river buffalo with those of cattle (Iannuzzi et al., 1987).

Sex chromosomes originated by complex rearrangements
(transpositions with loss or acquisition of constitutive
heterochromatin) (Iannuzzi et al., 2000b). The autosome homeologies
(and sex chromosome divergences) have been confirmed when
comparative physical maps among river buffalo, cattle and
sheep were performed by using the same molecular markers
(Iannuzzi, 1998; Iannuzzi et al., 2000a, b, c, 2001a, b; Di Meo
et al., 2000, 2002; El Nahas et al., 2001).

The use of specific molecular markers and the FISH technique
has resulted in noticeably extended cytogenetic maps of
several domestic species. In particular, the localization of both
type I and type II markers to specific chromosome regions and
bands adds detailed information about the physical organization
of animal genomes, allowing more detailed comparisons
among related and unrelated species to be performed (Schibler
et al., 1998; Di Meo et al., 2000, 2002; Iannuzzi et al., 2000a, b,
c, 2001a, b). These comparisons were particularly important in
demonstrating the chromosome rearrangements that have occurred
during the karyotypic evolution of bovids (Piumi et al.,
1998; Iannuzzi et al., 2000b, 2001a), as well as in revealing balanced
chromosomal abnormalities associated with clinical
cases (Iannuzzi et al., 2001d, e). Furthermore, cytogenetic mapping
of type II markers established connections among physical,
genetic and comparative maps, providing tools necessary to
show putative associations between genes and production
traits.

Because of the high chromosome homeology among bovids 
(Hayes et al., 1991; Gallagher and Womack, 1992; Iannuzzi 
and Di Meo, 1995; ISCNDB 2000, 2001), cattle and goat
genome maps have been used as templates in concert with cat¬ 
tle and goat derived genomic clones, to extend the river buffalo 
cytogenetic map. Iteratively, all bovine syntenic groups were 
assigned to all river buffalo chromosomes (or chromosome 
arms) (Iannuzzi, 1998; El Nahas et al., 2001). Furthermore, all 
31 bovine syntenic markers (Texas nomenclature, 1996) were 
FISH-mapped on both Q/G and R-banded cattle chromosomes 
(Hayes et al., 2000), and these assignments represent the basis 
for construction of the latest standard karyotypes of cattle, 
sheep and goat (ISCNDB 2000). More recently, the same synt¬ 
enic markers were FISH-mapped in river buffalo (Iannuzzi et 
al., 2001c), further supporting chromosome homeologies and 
conserved syntenies between river buffalo and other bovids 
(Iannuzzi et al., 2001c; ISCNDB 2000). 

In the present study, 64 loci earlier assigned to goat (63 loci,
Schibler et al., 1998; GoatBase,
http://locus.jouy.inra.fr/cgi-bin/lgbc/mapping/common/main.pl?BASE=goat,
April 2003)
and cattle (SRY, BovBase) chromosomes and representing 26
bovine syntenic groups, were FISH-mapped on river buffalo
R-banded chromosomes to extend the physical map of this species.
Furthermore, an improved river buffalo cytogenetic map
with a total of 293 loci is reported.

Materials and methods 

Concanavalin A stimulated peripheral blood lymphocytes were cultured
for three days in RPMI medium at 37.8 °C (CO2 incubator) and treated with
5-BrdU (15 µg/ml) and Hoechst 33258 (30 µg/ml) six hours before harvesting
(late incorporation) to obtain R-banding pattern preparations. Several cultures
were also treated with thymidine block (300 µg/ml for about 18 h). To
remove thymidine, cells were washed twice with Puck’s saline solution, and
BrdU and Hoechst 33258 (as reported above) in fresh medium were added.
Colcemid treatment (0.05 µg/ml and 0.5 µg/ml in unsynchronized and synchronized
cells, respectively) was performed for 1 and 1.5 h, respectively.
Slides were kept at -20 °C in slide boxes until use (several months). Slides
were stained with Hoechst 33258 (20 µg/ml) for 10 min, then washed with
distilled water, mounted in 2x SSC, pH 7.0 with glass coverslips, exposed to
UV light for 30 min and washed with distilled water. Slides were then treated
for FISH with caprine (Schibler et al., 1998) and bovine (SRY gene) BAC-clones
for about three days (weekend) in the presence of bovine COT-1 DNA and
placed in a moist chamber. After detection steps with FITC-avidin and
anti-avidin antibody (Oncor), slides were mounted with Antifade/Hoechst
33258 dye (3 µg/ml). Both R-banded metaphases (RBH-banding) and fluorescent
FITC signals were separately captured by a CCD-camera (Sensys,
Photometrics) and processed by superimposing FITC signals on RBH-banded
preparations. Chromosome identification and banding followed the
river buffalo standard karyotype (CSKBB, 1994). Gene symbols, names and
chromosome homeologies were in agreement with the BovBase and ISCNDB
2000, or with the GoatBase, when data were not available at the BovBase.


Fig. 1. Simultaneous visualization of FITC
signals (arrows) and RBH-banding in river buffalo
chromosomes with BAC-clones containing
D1S33 (a), D29S10 (b), D17S22 (c) and D20S10
(d) mapping to BBU1q45, BBU5p19, BBU17q13
and BBU19q15, respectively.

Results and discussion

Details of river buffalo chromosome preparations showing
simultaneous disclosure of FITC signals and RBH-banding
(R-banding by late incorporation of BrdU and Hoechst 33258 dye
staining) are shown in Fig. 1. At least 15 metaphases for each
BAC clone were studied. The frequency of FITC signals (chromosomes
with single or double-spot in one or both chromatids)
varied from 34% (CSSM54) to 68% (TGLA304). Sixty-four
loci were FISH-mapped: five (ADCYAP1, HRH1, IL3, RBP3B
and SRY) were of type I, the remaining 59 being of type II
(Table 1). The chromosomal localizations of all 63 autosomal loci
correspond with those earlier found in goat (Schibler et al.,
1998; GoatBase). Within the limits of different banding techniques
applied by different groups and different ideograms
(choice of landmark and chromosome regions) adopted in river
buffalo (CSKBB, 1994) and cattle (goat) (ISCNDB 2000) standard
karyotypes, the loci appeared to be localized not only on
homeologous chromosomes but also in homeologous chromosome
regions of river buffalo and goat (cattle).
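
Comparing localizations between species, as done above, requires splitting each band designation into chromosome, arm and band. The following is a minimal sketch of such a parser; the function name, the regular expression and the returned field names are illustrative assumptions, and only simple designations of the kind used in Fig. 1 and Table 1 (for example BBU1q45, 5p19, Xq24-25 or a bare chromosome number) are handled.

```python
import re

# Optional species prefix, chromosome (1-2 digits, X or Y),
# optional arm (p or q) and optional band or band range.
BAND_RE = re.compile(r"^(BBU|BTA|CHI)?(\d{1,2}|X|Y)([pq])?([\d.\-]*)$")

def parse_band(label):
    """Split a designation such as 'BBU1q45' into its components."""
    cleaned = label.replace(" ", "").rstrip("*")
    match = BAND_RE.match(cleaned)
    if match is None:
        raise ValueError("unrecognized band designation: %r" % label)
    species, chromosome, arm, band = match.groups()
    return {
        "species": species,
        "chromosome": chromosome,
        "arm": arm,
        "band": band or None,
    }

if __name__ == "__main__":
    for label in ("BBU1q45", "5p19", "Xq24-25", "12", "3q33*"):
        print(label, parse_band(label))
```

The trailing asterisk, which marks goat localizations in Table 1, is stripped before matching; a fuller parser would record it as a separate flag.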


Table 1. Alphabetical list and chromosome localization of loci mapped in river buffalo (BBU), cattle (BTA, BovBase) or goat (CHI, GoatBase), relative bovine syntenic group (U), mode and references

Locus symbol | Locus name | BBU | BTA / CHI* | U | Mode (a) | References
ABL1 | Abelson murine viral oncogene homolog | 12 | 11 | 16 | SR | El Nahas et al., 1996a
ACADM | acyl-Coenzyme A dehydrogenase, C-4 to C-12 straight chain | 6q33 | 3q33* | 6 | FISH | Iannuzzi et al., 2000a
ACTA1 | alpha skeletal actin | 4p | 28 | 29 | SR | El Nahas et al., 1996a
ACTA2 | actin, alpha 2, smooth muscle aorta | 23q13 | 26q17* | 26 | FISH | Iannuzzi et al., 2001a
ADA | adenosine deaminase ADA | 14q24 | 13q24* | 11 | FISH | Iannuzzi et al., 2001a
ADCYAP1 | adenylate cyclase activating polypeptide 1 | 22q22 | 24q22* | 28 | FISH | present study
AHSG | alpha 2 HS glycoprotein | 1q33 | 1q33-34* | 10 | FISH | Iannuzzi et al., 2000a
AK2 | adenylate kinase isozyme 2 | 2q35 | 2q45* | 17 | FISH | Iannuzzi et al., 2001a
ALAS | aminolevulinate, delta, synthetase 2 | Xq36 | Xq32-33 | X | FISH | Iannuzzi et al., 2000b
ALPI | intestinal alkaline phosphatase | 2q35 | 2q45 | 17 | FISH | Iannuzzi et al., 2000c
AMD1 | adenosylmethionine decarboxylase | 10q17 | 9q19* | 2 | FISH | Iannuzzi et al., 2001a
ANK1 | ankyrin 1, erythrocytic | 1p23 | 27q24* | 25 | FISH | Iannuzzi et al., 2000a
ANT1 | adenine nucleotide translocator 1 | 1p | 27 | 25 | SP | El Nahas et al., 1997
APOA1 | apolipoprotein A1 | 16q21 | 15q21* | 19 | FISH | Iannuzzi et al., 2001a
AR | androgen receptor | Xq32-33 | Xq25-26 | X | FISH | Iannuzzi et al., 2000b
ASIP | agouti signaling protein, nonagouti homolog | 14q22 | 13q22* | 11 | FISH | Iannuzzi et al., 2001a
ASS | arginosuccinate synthetase | 12q38 | 11q28* | 16 | FISH-SR | Iannuzzi et al., 2001a; El Nahas et al., 1996a
AVP | arginine vasopressin (neurophysin II, antidiuretic hormone, diabetes insipidus, neurohypophyseal) | 14q22 | 13q21-22 | 11 | FISH | Iannuzzi et al., 2001b
BF | B-factor, properdin | 2p22 | 23q22* | 20 | FISH | Di Meo et al., 2000
BM0848 | DNA segment | 16q2.11 | 15q29* | 19 | FISH | present study
BM4208 | DNA segment | 15q25 | 14q26* | 24 | FISH | present study
BM723 | DNA segment | 6q21 | 3q23* | 6 | FISH | present study
BMC8012 | DNA segment | 5p15 | 29q15* | 7 | FISH | present study
BoWC11 | bovine workshop cluster 11 | 5 | 29 or 16 | | SI | Abou-Mossallem, 1999; El Nahas et al., 2001
BSPN | brain specific protein I amino chain | 15 | 14 | 24 | SP | de Hondt et al., 1997
BTK | bruton agammaglobulinemia tyrosine kinase | Xq24-25 | Xq11-12 | X | FISH | Iannuzzi et al., 2000b
BULA (MHC) | major histocompatibility complex, class I | 2p22 | 23q22 | 20 | FISH | Iannuzzi et al., 1993c
BULA-DYA | major histocompatibility complex, class II, DY alpha (DYA class II MHC leucocyte antigens) | 2p13 | 23q13 | 20 | FISH | Iannuzzi et al., 2001c
C9 | complement component 9 | 19q19 | 20q17* | 14 | FISH | Iannuzzi et al., 2001a
CAD | aspartate transcarbamylase | 12q34 | 11q24* | 16 | FISH | Iannuzzi et al., 2001a
CAST | calpastatin | 9q2.14 | 7q26-27* | 22 | FISH | Iannuzzi et al., 2001a
CATHL@ | cathelicidins clone 66e23 | 21q24 | 22q24 | 12 | FISH | Iannuzzi et al., 1998a
CD14 | antigen CD14, LPS-binding protein | 12 | 11 | 16 | SI | El Nahas et al., 1996b
CD18 | antigen CD18, lymphocyte function-associated antigen | 1q | 1 | 10 | SP | de Hondt et al., 1997
CD71 | antigen CD71, transferrin receptor | 1q | 1 | | SI-SC | El Nahas et al., 1996b; Ramadan et al., 2000
CD81 | antigen CD81 (TAPA-1) | 22 | 24 | 28 | SI | Abou-Mossallem, 1999
CGA | glycoprotein hormones, alpha polypeptide | 10q22 | 9q22* | 2 | FISH-SP | Iannuzzi et al., 2001a; de Hondt et al., 1997
CGN1 | conglutinin 1 | 4p16 | 28q17 | 29 | FISH-SR | Iannuzzi et al., 1994a, 2001c; El Nahas et al., 1996a
CHGA | chromogranin A | 20q24 | 21q23* | 4 | FISH | Iannuzzi et al., 2001a
CHRNA7 | nicotinic acetyl choline receptor α7 subunit | 20q17 | 21q17 | 4 | FISH | Iannuzzi et al., 2001a
CHRNB1 | cholinergic receptor, nicotinic, β-polypeptide | 3p15 | 19q15 | 21 | FISH | Iannuzzi et al., 1999
CLCN1 | chloride channel 1, skeletal muscle (Thomsen disease, autosomal dominant) | 8q34 | 4q33-35* | 13 | FISH | Di Meo et al., 2000
COL3A1 | collagen, type III, alpha 1 | 2q12 | 2q12 | 17 | FISH | Iannuzzi et al., 2000c
COL9A1 | collagen type IX, alpha 1 | 10q12-13 | 14q12* | 2 | FISH | Iannuzzi et al., 2001a
Cos1489 | DNA segment | Xq47 | Xq43.1 | X | FISH | Prakash et al., 1997
Cos314 | DNA segment | Xq11.3-12 | Xp11-12 | X | FISH | Prakash et al., 1997

Table 1 (continued)

Locus symbol | Locus name | BBU | BTA / CHI* | U | Mode (a) | References
Cos945 | DNA segment | Xq34 | Xq24 | X | FISH | Prakash et al., 1997
COSAE7 | DNA segment | 22q24 | 24q23 | 28 | FISH | Iannuzzi et al., 1998a
COX8 | cytochrome c oxidase subunit CIX (VIII) | 5p19 | 29q24* | 7 | FISH | Iannuzzi et al., 2000a
CP | ceruloplasmin (ferroxidase) | 1q43 | 1q41* | 10 | FISH | Iannuzzi et al., 2000a
CRH | corticotropin releasing hormone | 15q16-17 | 14q19* | 24 | FISH | Iannuzzi et al., 2001a
CRP | C-reactive protein, pentraxin-related | 6q13 | 3q12-13* | 6 | FISH | Iannuzzi et al., 2000a
CRYAA | crystallin alpha A | 1q47 | 1q45* | 10 | FISH | Iannuzzi et al., 2000a
CRYBA1 | crystallin, beta polypeptide A1 | 3p15 | 19q15 | 21 | FISH | Iannuzzi et al., 1999
CRYG | crystallin a-polypeptide | 2q | 2 | 17 | SR | El Nahas et al., 1996a
CSN10 | casein, kappa | 7q32 | 6q32 | 15 | FISH | Iannuzzi et al., 2001c
CSN1S2 | alpha-S2-casein | 7q32 | 6q31 | 15 | FISH | Iannuzzi et al., 1996a
CSRM60 | DNA segment | 11 | 10 | 5 | SP | de Hondt et al., 2000
CSSM30 | DNA segment | 14q22 | 13q22* | 11 | FISH | Iannuzzi et al., 2001b
CSSM41 | DNA segment | 21 | 22 | 12 | SP | de Hondt et al., 2000
CSSM47 | DNA segment | 3q27 | 8q27* | 18 | FISH | present study
CSSM6 | DNA segment | 21 | 22 | 12 | SP | de Hondt et al., 2000
CSSM66 | DNA segment | 15q13 | 14q15-16* | 24 | FISH | present study
CTSLL | cathepsin L | 3q25 | 8q25-26* | 18 | FISH | Iannuzzi et al., 2001a
CYP19 | aromatase (cytochrome P450, subfamily XIX) | 11q26 | 10q26 | 5 | FISH | Iannuzzi et al., 2001a
D10S10 | DNA segment (TGLA272) | 11q36-37 | 10q34* | 5 | FISH | present study
D10S2 | DNA segment (TGLA378) | 11q22 | 10q24* | 5 | FISH | present study
D10S25 | DNA segment (ILSTS005) | 11q36-37 | 10 | 5 | FISH | present study
D10S36 | DNA segment (JAB10) | 11q13 | 10q15 | 5 | FISH | Iannuzzi et al., 1998a
D11S63 | DNA segment (ILSTS028) | 12q38 | 11q27* | 16 | FISH | present study
D11S7 | DNA segment (CSSM52) | 12 | 11 | 16 | SP | Othman and El Nahas, 1999
D12S16 | DNA segment (IDVGA41) | 13q15 | 12q15 | 27 | FISH | Iannuzzi et al., 1997a
D12S2 | DNA segment (TGLA 9) | 13 | 12 | 27 | SP | Oraby et al., 1998
D12S21 | DNA segment (IDVGA57) | 13q13 | 12q13 | 27 | FISH | Di Meo et al., 2002
D13S11 | DNA segment (BL42) | 14q22 | 13q22 | 11 | FISH | Iannuzzi et al., 2001b
D13S13 | DNA segment (BMC1222) | 14q13 | 13q12-13 | 11 | FISH | present study
D13S31 | DNA segment (IDVGA87) | 14q24 | 13q22-23 | 11 | FISH | Di Meo et al., 2002
D14S2 | DNA segment (CSSM36) | 15 | 14 | 24 | SP | de Hondt et al., 1997
D14S35 | DNA segment (IDVGA76) | 15q15 | 14q14 | 24 | FISH | Iannuzzi et al., 1998a
D15S13 | DNA segment (IDVGA10) | 16q27 | 15q25 | 19 | FISH | Di Meo et al., 2002
D15S16 | DNA segment (IDVGA32) | 16q25 | 15q25 | 19 | FISH | Iannuzzi et al., 1997a
D16S21 | DNA segment (IDVGA49) | 5q21 | 16q17 | 1 | FISH | Iannuzzi et al., 1997b
D16S23 | DNA segment (IDVGA68) | 5q21 | 16q16 | 1 | FISH | Di Meo et al., 2002
D16S30 | DNA segment (IDVGA26) | 5q25 | 16q21 | 1 | FISH | present study
D16S34 | DNA segment (IDVGA66) | 5q11-12 | 16q12 | 1 | FISH | Di Meo et al., 2002
D16S8 | DNA segment (HUJ614) | 5q12 | 16 | 1 | SP | El Nahas et al., 1999
D16S9 | DNA segment (HUJ625) | 5q27 | 16 | 1 | FISH | present study
D17S21 | DNA segment (OarFCB048) | 17q15 | 17q15* | 23 | FISH | present study
D17S22 | DNA segment (OarVH098) | 17q13 | 17q13* | 23 | FISH | present study
D18S1 | DNA segment (TGLA227) | 18 | 18 | 9 | SP | Oraby et al., 1998
D18S17 | DNA segment (Haut14) | 18q22 | 18q21 | 9 | FISH | present study
D18S2 | DNA segment (UWCA5) | 18 | 18 | 9 | SP | Oraby et al., 1998
D18S21 | DNA segment (INRA210) | 18q24 | 18q24* | 9 | FISH | present study
D18S5 | DNA segment (INRA063) | 18q24 | 18q22* | 9 | FISH | present study
D19S18 | DNA segment (IDVGA46) | 3p22 | 19q16 | 21 | FISH | Di Meo et al., 2002
D19S19 | DNA segment (IDVGA47) | 3p22 | 19q17 | 21 | FISH | Iannuzzi et al., 1997b
D19S26 | DNA segment (IDVGA58) | 3p15 | 19q15 | 21 | FISH | Di Meo et al., 2002
D1S33 | DNA segment (BL28) | 1q45 | 1q42 | 10 | FISH | present study
D1S4 | DNA segment (MAF 46, ovine) | 1q | 1 | 10 | SP | de Hondt et al., 1997
D1S88 | DNA segment (BMS1757) | 1q45 | 1 | 10 | FISH | present study
D20S10 | DNA segment (TGLA304) | 19q15 | 20q14* | 14 | FISH | present study
D20S13 | DNA segment (BM3517) | 19q13 | 20q12* | 14 | FISH | present study
D21S11 | DNA segment (CSSM18) | 20 | 21 | 4 | SP | de Hondt et al., 2000
D21S12 | DNA segment (INRA031) | 20q22 | 21q21* | 4 | FISH | present study
D21S4 | DNA segment (ETH 131) | 20 | 21 | 4 | SP | de Hondt et al., 2000
D21S43 | DNA segment (IDVGA79) | 20q15 | 21q14-15 | 4 | FISH | Di Meo et al., 2002
D21S45 | DNA segment (ILSTS052) | 20q13 | 21q14* | 4 | FISH | present study
D24S3 | DNA segment (CSSM31) | 8(22) | 24 | 13 | SP | de Hondt et al., 2000
D25S12 | DNA segment (IDVGA71) | 24q13 | 25q12 | 8 | FISH | Iannuzzi et al., 1997a
D26S14 | DNA segment (IDVGA59) | 23q22 | 26q22 | 26 | FISH | Iannuzzi et al., 1997a
D27S10 | DNA segment (BM6526) | 1p21 | 27q12.2* | 25 | FISH | present study
D27S4 | DNA segment (CSSM043) | 1p23 | 27q22* | 25 | FISH | present study
D27S6 | DNA segment (HUJI13) | 1p25 | 27q24* | 25 | FISH | present study
D28S10 | DNA segment (IDVGA8) | 4p16 | 28q18-19 | 29 | FISH | present study
D28S12 | DNA segment (IDVGA29) | 4p12 | 28q13 | 29 | FISH | Di Meo et al., 2002
D28S2 | DNA segment (ETH1112) | 4p | 28 | 29 | SP | El Nahas et al., 1999
D29S10 | DNA segment (BMC1206) | 5p19 | 29q24 | 7 | FISH | present study
D29S16 | DNA segment (INRA143) | 5p13 | 29q12 | 7 | FISH | present study




Cytogenet Genome Res 102:65-75 (2003) 






Table 1 (continued) 


Locus symbol 

Locus name 

BBU 

BTA 

CHI* 

U 

Mode 3 

References 

D29S2 

DNA segment (IDVGA7) 

5pl9 

29q24 

7 

FISH 

Iannuzzi et ah, 1997b 

D2S18 

DNA segment (AR028) 

2q35 

2q43* 

17 

FISH 

present study 

D2S21 

DNA segment (OarFCB020) 

2q25 

2q23* 

17 

FISH 

present study 

D2S25 

DNA segment (IDVGA64) 

2q35 

2q44 

17 

FISH 

present study 

D2S27 

DNA segment (ILSTS082) 

2q29 

2q31-32* 

17 

FISH 

present study 

D2S53 

DNA segment (INRA231) 

2q35 

2q45* 

17 

FISH 

present study 

D3S2 

DNA segment (CSSM54) 

6q21 

3q22* 

6 

FISH 

present study 

D3S25 

DNA segment (IDVGA35) 

6q37 

3q35 

6 

FISH 

Di Meo et ah, 2002 

D3S29 

DNA segment (IDVGA53) 

6ql5 

3q21 

6 

FISH 

Iannuzzi et ah, 1997a 

D3S4 

DNA segment (HUJ177) 

6q35 

3q33* 

6 

FISH 

present study 

D4S34 

DNA segment (IDVGA84) 

8q32 

4q32 

13 

FISH 

Di Meo et ah, 2002 

D4S7 

DNA segment (CSSM14) 

8 

4 

13 

SP 

Othman and El Nahas, 1999 

D5S15 

DNA segment (BMC 1009) 

4q21 

5q23 

3 

FISH 

present study 

D5S25 

DNA segment (BM2830) 

4q37 

5q35* 

3 

FISH 

present study 

D6S14 

DNA segment (BM1329) 

7q21 

6q 15 * 

15 

FISH 

present study 

D6S17 

DNA segment (IDVGA61) 

8q34 

4q32 

13 

FISH 

Iannuzzi et ah, 1997a 

D7S3 

DNA segment 

9 

7 

22 

SP 

de Hondt et ah, 1997 

D8S11 

DNA segment (TGLA010) 

3q 15 

8q 15 * 

18 

FISH 

present study 

D9S1 

DNA segment (ETH 225) 

10 

9 

2 

SP 

de Hondt et ah, 1997 

D9S3 

DNA segment (TGLA73) 

10q24 

9q23* 

2 

FISH 

present study 

D9S5 

DNA segment (INRA127) 

1 Oq 15 

9ql4* 

2 

FISH 

present study 

DEFB@ 

beta-defensin 

lpl2 

27q13-14 

25 

FISH 

Iannuzzi et ah, 1996b, 2001c 

DIK26 

DNA segment (411G12) 

lq28 

1 

10 

FISH 

present study 

DMD5 

dystrophin (muscular dystrophy, Duchenne and Becker 
types), includes DXS142, DXS164, DXS206, DXS230, 
DXS239, DXS268, DXS269, DXS270, DXS272 (361B7) 

Xq41-42 

Xq33-34 

X 

FISH 

Iannuzzi et ah, 2000b 

DSC1 | desmocollin 1 | 22q22 | 24q21 | 28 | FISH | Iannuzzi et al., 2001c
DVEPC076 | DNA segment | Xq23 | Xp11-12 | X | FISH | Iannuzzi et al., 2000b
DVEPC102 | DNA segment | Xq44 | Xq35 | X | FISH | Iannuzzi et al., 2000b
DVEPC107 | DNA segment | Xq24-25 | Xq21-22 | X | FISH | Iannuzzi et al., 2000b
DVEPC109 | DNA segment | Xq24-25 | Xq23 | X | FISH | Iannuzzi et al., 2000b
DVEPC113 | DNA segment | 1q47 | 1q45* | 10 | FISH | present study
DVEPC119 | DNA segment | 1q45 | 1q44-45 | 10 | FISH | present study
DVEPC132 | DNA segment | Xq25-31 | Xq21 | X | FISH | Iannuzzi et al., 2000b
DVEPC137 | DNA segment | Xq23 | Xp11 | X | FISH | Iannuzzi et al., 2000b
DXS113 | DNA segment (DVEPC053) | Xq45 | Xq41 | X | FISH | Iannuzzi et al., 2000b
DXS30 | DNA segment (IDVGA82) | Xq44 | Xq34 | X | FISH | Iannuzzi et al., 1998a
DXS44 | DNA segment (DVEPC014) | Xq21 | Xp22 | X | FISH | Iannuzzi et al., 2000b
DXS51 | DNA segment (DVEPC027) | Xq38 | Xq32 | X | FISH | Iannuzzi et al., 2000b
DXS54 | DNA segment (DVEPC041) | Xq23 | Xp11-12 | X | FISH | Iannuzzi et al., 2000b
DXS61 | DNA segment (DVEPC052) | Xq21 | Xp22 | X | FISH | Iannuzzi et al., 2000b
DXS67 | DNA segment (DVEPC065) | Xq31-32 | Xp24 | X | FISH | Iannuzzi et al., 2000b
DXYS3 | DNA segment (TGLA325) | Xq46-47, Yq21-22 | Xq43, Yp13 | X | FISH | Iannuzzi et al., 2000b
DYZ10 | DNA segment (IDVGA50) | Yq12-1.10 | Yq11-12.3 | Y | FISH | Iannuzzi et al., 1998a

EDN1 | endothelin 1 | 2p24 | 23q25* | 20 | FISH | Di Meo et al., 2000
EDNRB | endothelin receptor type B | 13q22 | 12q22 | 27 | FISH | Iannuzzi et al., 2001a
EEF2 | elongation factor 2 | 9q15 | 7q15 | 22 | FISH | Iannuzzi et al., 1997c
ELN | elastin (supravalvular aortic stenosis, Williams-Beuren syndrome) | 24q21 | 25q22 | 8 | FISH | Iannuzzi et al., 2001c
F10 | coagulation factor X | 13 | 12 | 27 | SP | Oraby et al., 1998
FDX1(AD) | ferredoxin, iron sulfur electron transport protein for mitochondrial P450s | 16q21 | 15q14* | 19 | FISH | Iannuzzi et al., 2001a
FGG | fibrinogen, gamma polypeptide 111B7 | 17q13 | 17q13 | 23 | FISH | Iannuzzi et al., 2001c
FN1 | fibronectin | 2q | 2 | 17 | SP | Othman and El Nahas, 1999
FSHB | follicle stimulating hormone, beta polypeptide 114R2C11 | 16q28 | 15q26-27 | 19 | FISH-SP | Iannuzzi et al., 2001c; Oraby et al., 1998
FUCA1P | fucosidase alpha-L-1 tissue | 2q | 2 | 17 | SR | El Nahas et al., 1996a
GAPD | glyceraldehyde-3-phosphate dehydrogenase | 4q | 5 | 3 | SZ-SC | de Hondt et al., 1991; El Nahas et al., 1993
GAS | gastrin | 3p22 | 19q17* | 21 | FISH | Iannuzzi et al., 2001a
GGTA1 | alpha-galactosyltransferase 1 | 12q36 | 11q26 | 16 | FISH | Iannuzzi et al., 1997c
GH1 | growth hormone 1 | 3p24 | 19q22 | 21 | FISH | Iannuzzi et al., 1999, 2001c
GJA1 (CX43) | gap junction protein, alpha 1, 43kD (connexin 43) | 10q17 | 9q15-16 | 2 | FISH | Iannuzzi et al., 1998a
GLI2 | gli-kruppel family member 2 | 2q29 | 2q31* | 17 | FISH | Iannuzzi et al., 2001a
GNAS1 (BIO 145) | guanine nucleotide binding protein (G protein), alpha stimulating activity polypeptide 1 | 14q22 | 13q22 | 11 | FISH | Iannuzzi et al., 2001b
GNRHR | gonadotropin-releasing hormone receptor | 7q32 | 6q31* | 15 | FISH | Iannuzzi et al., 2000d
GPI | glucose phosphate isomerase | 18q24 | 18q24 | 9 | FISH | Iannuzzi et al., 2001c
GRP58 | glucose regulated protein 58kD | 20q24 | 21 | 4 | FISH | Iannuzzi et al., 2001a
GSTA1 | glutathione S-transferase A1 | 2p22 | 23q15-21* | 20 | FISH | Iannuzzi et al., 2000d
HBA | hemoglobin alpha 1 | 24q13 | 25 | 8 | FISH | Iannuzzi et al., 2000a
HBB | hemoglobin, beta | 16q25 | 15q22-27 | 19 | FISH-SP | Iannuzzi et al., 2001; Oraby et al., 1998

HCK (BI089) | haemopoietic cell kinase | 14q22 | 13q22 | 11 | FISH | Iannuzzi et al., 2001b
HEXA | beta-N acetyl glucosaminidase | 11 | 10 | 5 | SZ | El Nahta, 1996; El Nahas et al., 2001
HRH1 | histamine receptor H1 | 21q24 | 22q24* | 12 | FISH | present study
HSD3B1 | hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 1 219F9 | 6q21 | 3q21 | 6 | FISH | Iannuzzi et al., 2001c
IDVGA70 | DNA segment | 18q22 | 18q21 | 9 | FISH | Di Meo et al., 2002
IDVGA74 | DNA segment | 18q13 | 18q13 | 9 | FISH | Di Meo et al., 2002
IFN1@ | interferon, alpha, leukocyte | 3q15 | 8q15 | 18 | FISH | Iannuzzi et al., 2001c
IFNG | gamma interferon | 4q23 | 5q23 | 3 | ISH-FISH | Hassanane et al., 1994; Di Meo et al., 2000; Iannuzzi et al., 2001c
IFNT | trophoblast interferon | 3q15 | 8q15 | 18 | FISH | Iannuzzi et al., 1993a
IFNW | omega interferon | 3q15 | 8q15 | 18 | FISH | Iannuzzi et al., 1993a
IGF1 | insulin-like growth factor 1 (somatomedin C) | 4q31 | 5q31* | 3 | FISH-SP | Di Meo et al., 2000; El Nahas et al., 1999
IGF2 | insulin-like growth factor 2 (somatomedin A) | 5p19 | 29q24 | 7 | FISH | Iannuzzi et al., 2001c
IGF2R | insulin-like growth factor 2 receptor | 10q26 | 9q26 | 2 | FISH | Iannuzzi et al., 2001c
IGFBP3 | insulin-like growth factor binding protein-3 | 8q24 | 4p26* | 13 | FISH | Di Meo et al., 2000
IGH@ | immunoglobulin heavy locus | 20q24 | 21q24 | 4 | FISH | Iannuzzi et al., 2001c
IGHG | immunoglobulin gamma heavy chain | 20q22-24 | 21q24 | 4 | ISH | Hassanane et al., 1993
IL1B | interleukin 1, beta | 12q26 | 11q22-24 | 16 | FISH | Iannuzzi et al., 2000c
IL2RA | interleukin 2 receptor, alpha | 14q13 | 13q13 | 11 | FISH | Iannuzzi et al., 2001c
IL3 | interleukin 3 | 9q15 | 7q15-21 | 22 | FISH | present study
INHA | inhibin A subunit | 2q | 2q36-42 | 17 | SP | de Hondt et al., 2000
INHBA | inhibin, beta A (activin A, activin AB alpha polypeptide) | 8q24 | 4q26 | 13 | FISH-SP | Iannuzzi et al., 2001c; Othman and El Nahas, 1999

KRT | keratin | 4q21 | 5q14-23 | 3 | FISH | Di Meo et al., 2000
KRTAP8 | keratin associated protein HGT-F | 1q12 | 1q12 | 10 | FISH | Iannuzzi et al., 2000a
L08239 | human EST L08239 | Xq34 | Xq31 | X | FISH | Iannuzzi et al., 2000b
LAMP | lysosome-associated membrane protein 145F2 | Xq12 | Xp23-24 | X | FISH | Iannuzzi et al., 2000b
LDHA | lactate dehydrogenase A | 5p | 29 | 7 | SZ | El Nahas et al., 1999
LDHB | lactate dehydrogenase B | 4q | 5 | 3 | SZ-SC | de Hondt et al., 1991; El Nahas et al., 1993, 1999
LDLR | low-density lipoprotein receptor | 9 | 7 | 22 | SP | de Hondt et al., 1997
LGB | lactoglobulin, beta | 12q38 | 11q28 | 16 | FISH-SP | Iannuzzi et al., 2001c; Othman and El Nahas, 1999
LHB | luteinizing hormone beta polypeptide | 18q24 | 18q24* | 9 | FISH | Iannuzzi et al., 2001a
LIF | leukemia inhibitory factor | 17q13 | 17q12 | 23 | FISH | Iannuzzi et al., 2001a
LPO | lactoperoxidase | 3p13 | 19q13 | 21 | FISH | Iannuzzi et al., 2001a
LTF | lactotransferrin | 21q24 | 22q24 | 12 | FISH | Iannuzzi et al., 2001c
LYZ | lysozyme | 4q23 | 5q23 | 3 | FISH | Iannuzzi et al., 1993b
MAP1B | microtubule associated protein 1B | 19q13 | 20q13.1 | 14 | FISH | Iannuzzi et al., 1998a, 2001c
MAP2 | microtubule associated protein 2C | 3p22 | 19q21-22* | 21 | FISH-SP | Iannuzzi et al., 2001a; de Hondt et al., 2000
MBP | myelin basic protein | 22 | 24q11-13.2 | 28 | SR | El Nahas et al., 1996a
MCM104 | DNA segment | 18q24 | 18q24* | 9 | FISH | present study
MCM218 | DNA segment | 8q15 | 4q14* | 13 | FISH | present study
MCM74 | DNA segment | Xq44 | Xq35 | X | FISH | Iannuzzi et al., 2000b
ME1 | malic enzyme | 10 | 9 | 2 | SZ | de Hondt et al., 1991; El Nahas et al., 1998
MMP1 | matrix metalloproteinase 1 | 16q13 | 15q12* | 19 | FISH | Iannuzzi et al., 2001a
MT2A | metallothionein 2A | 18q15 | 18q24* | 9 | FISH | Iannuzzi et al., 2001a
MTP | microsomal triglyceride transfer protein (large polypeptide, 88kD) | 7q21 | 6q15* | 15 | FISH | Di Meo et al., 2000

MYC | v-myc avian myelocytomatosis viral oncogene | 15q16-17 | 14q16 | 24 | FISH | Iannuzzi et al., 2001a
NCK1 | NCK adaptor protein 1 | 1q45 | 1q43* | 10 | FISH | Iannuzzi et al., 2000a
NF1 | neurofibromin 1 (neurofibromatosis, von Recklinghausen disease, Watson disease) | 3p13 | 19q14 | 21 | FISH | Iannuzzi et al., 1999
NGFB | nerve growth factor, beta polypeptide | 6q23 | 3q23 | 6 | FISH | Iannuzzi et al., 2000a
NP | nucleoside phosphorylase | 11q15 | 10q21* | 5 | FISH-SZ | Iannuzzi et al., 2001a; El Nahas et al., 1998
NPR3 | natriuretic peptide receptor C (ANPRC) | 19q19 | 20q17* | 14 | FISH | present study
NRAS | neuroblastoma RAS viral (v-ras) oncogene homolog | 6q25 | 3q23* | 6 | FISH | Iannuzzi et al., 2000a
OarAE101 | DNA segment | 7q21 | 6q18* | 15 | FISH | present study
OarCP09 | DNA segment | 15q24 | 14q24* | 24 | FISH | present study
OarCP16 | DNA segment | 17q26 | 17q13* | 23 | FISH | present study
OarCP73 | DNA segment | 2p13 | 23q13* | 20 | FISH | present study
OarFCB5 | DNA segment (OARFCB005) | 4q21 | 5q21* | 3 | FISH | present study
OarHH35 | DNA segment | 8q32 | 4q31* | 13 | FISH | present study
OarHH56 | DNA segment | 2p24 | 23q24* | 20 | FISH | present study
OarJMP58 | DNA segment | 1p25 | 27q23* | 25 | FISH | present study
OCAM | opioid binding and cell adhesion molecule | 5p15 | 29q22* | 7 | FISH-SP | Iannuzzi et al., 2000a; El Nahas et al., 1999
OPN1SW (BCP) | opsin 1 (cone pigments), short-wave-sensitive (color blindness, tritan) | 8q32 | 4q32* | 13 | FISH | Di Meo et al., 2000
OXT | prepro-oxytocin | 14q22 | 13q21-22 | 11 | FISH-SP | Iannuzzi et al., 2001b; de Hondt et al., 1997
P4HB | procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), beta polypeptide (protein disulfide isomerase; thyroid hormone binding protein p55) | 3p24 | 19q22 | 21 | FISH | Iannuzzi et al., 1999
PAX6 | paired box homeotic gene 6 | 16q29 | 15q27* | 19 | FISH | Iannuzzi et al., 2001a
PDE6B | phosphodiesterase 6B, cGMP-specific, rod, beta (congenital stationary night blindness 3, autosomal dominant) | 7q36 | 6q36* | 15 | FISH | Di Meo et al., 2000
PGD | phosphogluconate dehydrogenase | 5q23 | 16q21* | 1 | FISH-SZ | Iannuzzi et al., 2001a; El Nahas et al., 1999
PGK1 | phosphoglycerate kinase 1 | Xq32-33 | Xq25-26 | X | FISH | Iannuzzi et al., 2000b, 2001c
PGM3 | phosphoglucomutase | 10 | 9 | 2 | SZ | El Nahas et al., 1998
PGY3 | P glycoprotein 3/multiple drug resistance | 8 | 4 | 13 | SR | El Nahas et al., 1996a
PIGR | polymeric immunoglobulin receptor | 5q12 | 16q12 | 1 | FISH | Iannuzzi et al., 2001c
PKM2 | pyruvate kinase, muscle 2 | 11 | 10 | 5 | SZ | El Nahas et al., 1998
PLAT | plasminogen activator, tissue | 1p25 | 27q18-19 | 25 | FISH | Iannuzzi et al., 2000a
PLP3(PLP1) | proteolipid protein 1 (Pelizaeus-Merzbacher disease, spastic paraplegia 2, uncomplicated) | Xq24-25 | Xq11-12 | X | FISH | Iannuzzi et al., 2000b
PNMT | phenylethanolamine N methyl transferase | 3p22 | 19q17* | 21 | FISH | Iannuzzi et al., 2001a
PRL | prolactin | 2p | 23 | 20 | SP | Othman and El Nahas, 1999
PRNP | prion protein (p27-30) (Creutzfeldt-Jakob disease, Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia) | 14q15 | 13q15 | 11 | FISH-SP | Iannuzzi et al., 1998b; de Hondt et al., 1997
PROC | map protein C | 2q12 | 2q12 | 17 | FISH | Iannuzzi et al., 2000c
RASA1 | GTPase activating protein RAS p21 | 9q2.14 | 7q25 | 22 | FISH | Iannuzzi et al., 2001a, 2001c
RB1 | retinoblastoma 1 | 13q13 | 12q13 | 27 | FISH | Iannuzzi et al., 2001c
RBP3B | retinol binding protein 3, interstitial | 4p18 | 28q19* | 29 | FISH-SP | present study
RHD | rhesus blood groups, D antigen | 2q35 | 2q45 | 17 | FISH | Iannuzzi et al., 2001a
RM006 | DNA segment (RM006) | 9q15 | 7q15* | 22 | FISH | present study
RYR1 | ryanodine receptor 1 | 18q23-24 | 18q23-24 | 9 | FISH | Iannuzzi et al., 2001a
S100A6 (CACY) | S100 calcium-binding protein A6 (calcyclin) | 6q15 | 3q21* | 6 | FISH | Iannuzzi et al., 2000a
SERPINA1 | protease inhibitor 1, alpha-1-antitrypsin | 20q24 | 21q24* | 4 | FISH | Iannuzzi et al., 2001a
SGCG (DAGA4) | dystrophin associated glycoprotein, gamma sarcoglycan | 13q15 | 12q15-21* | 27 | FISH | Iannuzzi et al., 2001a
SLC25A6 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 6 | Xq46-47, Yq21-22 | Xq43, Yp23 | X | FISH | Iannuzzi et al., 2000b

SOD1 | superoxide dismutase 1, soluble (amyotrophic lateral sclerosis 1 (adult)) | 1q12-13 | 1q14 | 10 | FISH | Iannuzzi et al., 2001c
SOD2 | superoxide dismutase 2 | 10 | 9 | 2 | SZ | de Hondt et al., 1991
SRCRSP5 | DNA segment (SRCRSP5) | 20q15 | 21q14* | 4 | FISH | present study
SRY | sex determining region Y | Yq17 | Yq12.3 | Y | FISH | present study
TCRB | T-cell receptor beta cluster | 8* | 4 | 13 | SR | El Nahas et al., 1996a
TG | thyroglobulin | 15q13 | 14q13 | 24 | FISH | Iannuzzi et al., 2001c
THBD | thrombomodulin | 14q15 | 13q17 | 11 | FISH | Iannuzzi et al., 2001b
TNFRSF6 | tumor necrosis factor receptor superfamily, member 6 | 23q13 | 26q13 | 26 | FISH | Iannuzzi et al., 2001c
TOP1 | topoisomerase (DNA) I | 14q22-23 | 13q22-23 | 11 | FISH | Iannuzzi et al., 2001b
TP53 | p53 tumor suppressor phosphoprotein | 3p15 | 19q15-16 | 21 | ISH | Iannuzzi et al., 1999
TPI1 | triose-phosphate isomerase 1 | 4q | 5 | 3 | SZ-SC | de Hondt et al., 1991; El Nahas et al., 1993
TPM1 | alpha tropomyosin | 11q26 | 10q26* | 5 | FISH | Iannuzzi et al., 2001a
TRB@ | T-cell receptor beta cluster | 8q34 | 4q34 | 13 | FISH | Antonacci et al., 2001
UMPS | uridine monophosphate synthase | 1q31 | 1q31-q36 | 10 | FISH | Iannuzzi et al., 1994b
UOX | urate oxidase | 6q31 | 3q32* | 6 | FISH | Iannuzzi et al., 2000a
VIL | villin | 2q33 | 2q43 | 17 | FISH | Iannuzzi et al., 1997d, 2001c
VIM | vimentin | 14q15 | 13q16-17 | 11 | FISH | Iannuzzi et al., 2001b
WT1 | Wilms tumor 1 | 16q29 | 15q26* | 19 | FISH | Iannuzzi et al., 2001a
X81804 | zinc finger protein | 18q24 | 18q24 | 9 | FISH | Iannuzzi et al., 1997e
XBM31 | DNA segment | Xq45 | Xq42 | X | FISH | Iannuzzi et al., 2000b
YES1 | Yamaguchi sarcoma viral oncogene homolog 1 | 22 | 24 | 28 | SR | El Nahas et al., 1996a
ZFY | zinc finger protein Y-linked | Yq21 | Yp12.2 | Y | FISH | Iannuzzi et al., 2001c
ZNF164 | zinc finger protein 164 | 17q24 | 17q24 | 23 | FISH | Iannuzzi et al., 1997e


FISH: direct assignment by fluorescence in situ hybridization; ISH: direct assignment by isotopic in situ hybridization; S: indirect assignment by using somatic cell hybrids; 
P: polymerase chain reaction; Z: isozyme electrophoresis; R: restriction endonuclease; I: immunofluorescence; C: chromosome identification 


Table 1 lists all loci mapped in river buffalo by the various 
techniques. Assignments in river buffalo are compared with those 
reported in cattle (or goat) on the basis of BovBase and GoatBase. 
Essentially all in situ mapped loci were localized to homeologous 
chromosomes and chromosome regions. The only exception is COL9A1, 
which maps to different chromosomes because of a simple translocation 
of a small pericentromeric region that occurred between Bovinae 
(cattle and river buffalo) chromosome 9 and Caprinae (goat and sheep) 
chromosome 14 (Iannuzzi et al., 2001a), as also revealed by 
comparative linkage mapping analyses between sheep and cattle 
(de Gortari et al., 1998). 

A discrepancy was found for D24S3, which was assigned to BBU8 (U13) 
by somatic cell hybrid analysis (de Hondt et al., 2000; El Nahas et 
al., 2001) but to BTA24 (U28, BovBase). Since several markers from 
both U13 (BTA4) and U28 (BTA24) have been assigned to BBU8 and 
BBU22, respectively (Fig. 2, Table 1), D24S3 should be assigned to 
BBU22. 
Fig. 2. R-banded river buffalo ideogram with all mapped loci. Type I loci 
are shown in roman type and type II loci in italics. 
Almost all FISH-mapped loci were localized in R-positive 
bands (euchromatin-rich regions) (Fig. 2). This agrees with the 
results obtained with human painting probes, which hybridized almost 
exclusively to R-positive bands, where conserved sequences are 
concentrated (Iannuzzi et al., 1998c). 

Including the assignments herein, a total of 293 loci (171 of 
type I and 122 of type II) have been assigned to river buffalo, 
extending our knowledge of the physical organization of 
its genome (Table 1, Fig. 2). Of the 293 loci, 247 were mapped 
by in situ hybridization (245 by FISH), 15 by both FISH and 
somatic cell hybrid analysis, and the remaining ones by 
somatic cell hybrid analysis alone. 
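
The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below assumes that the three mapping categories (in situ hybridization only, FISH plus somatic cell hybrids, somatic cell hybrids only) do not overlap, which is how the sentence reads; the function and key names are illustrative only.

```python
def tally_loci() -> dict:
    """Cross-check the locus counts quoted in the text (assumed disjoint classes)."""
    total_loci = 293
    type_i, type_ii = 171, 122
    in_situ = 247            # mapped by in situ hybridization
    in_situ_fish = 245       # subset of the above mapped by FISH
    fish_and_hybrids = 15    # mapped by both FISH and somatic cell hybrids

    # The type I / type II split must add up to the stated total.
    assert type_i + type_ii == total_loci

    return {
        "isotopic_ish": in_situ - in_situ_fish,
        "hybrids_only": total_loci - in_situ - fish_and_hybrids,
    }


if __name__ == "__main__":
    print(tally_loci())
```

Under that assumption, the loci mapped by somatic cell hybrid analysis alone are simply what is left after subtracting the other two classes from 293.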


Because the river buffalo karyotype has some biarmed chromosomes 
that serve as marker chromosomes for the acrocentric chromosomes of 
other bovids, which are hard to distinguish, the river 
buffalo cytogenetic map and R-banded ideogram (Fig. 2) are 
particularly useful tools for confirming assignments and 
identifying chromosomes among the Bovidae (Piumi et al., 1998; 
Schibler et al., 1998; Di Meo et al., 2000; Iannuzzi et al., 2000a, 
b, 2001a). Maps of some river buffalo chromosomes (or chromosome 
arms) (1p, 3p, 5p, 5q, 16, 18, 20, X) are clearly 
denser than others. More work is necessary to increase 
the number of marker assignments in this species, especially on 
chromosomes 1q, 2q, 3q, 4q, 7, 8, 9, 12, 21, 22, 23 and 24. 
Abstract. Genome maps in livestock species have been 
under development for the last decade. While the sheep map is 
one of the least advanced for livestock, the amount of available 
information is noteworthy, in light of the paucity of funding 
and personnel devoted to this project. These limited resources 
have been strategically aligned to take advantage of information 
from the human, mouse and bovine mapping and sequencing 
efforts. The resulting ovine linkage and physical maps have 
greatly enhanced the search for genes controlling important 
traits in sheep. In order to improve the efficiency of these 
investigations, it is imperative that efforts on the sheep comparative 
map be continued. 

Copyright©2003 S. Karger AG, Basel 


Ovine linkage maps 

Development of the linkage map for sheep has been critical 
for genomic studies in this species. Prior to 1994, only 17 markers 
were assigned to seven syntenic groups (Broad et al., 1997). 
In 1994, 19 linkage groups containing 52 markers, including 
microsatellites and candidate gene restriction fragment length 
polymorphisms (RFLPs), were identified (Crawford et al., 
1994). These assignments were a consequence of a genome scan 
initiated to map the Booroola fecundity gene (Montgomery et 
al., 1993) using 12 pedigrees segregating for the Booroola gene. 
A more extensive ovine linkage map was published by this 
group in 1995 (Crawford et al., 1995) and contained 246 markers 
(86 ovine microsatellites, 126 bovine microsatellites, one 
deer microsatellite, and 33 known genes), with marker spacing 
between 10 and 30 cM across the chromosomes. Total coverage 
of the map was 2070 cM (about 75% of the genome) and markers 
were assigned to all 26 sheep autosomes. This map was 
constructed using the AgResearch International Mapping Flock 
(IMF). 

Three years later, a second-generation ovine genetic map 
was published by the USDA, ARS group in Nebraska (de Gortari 
et al., 1998). The map contained 519 markers (402 bovine 
microsatellites, 101 ovine microsatellites, and 16 known genes) 
and spanned 3063 cM across the autosomes, with an average 
marker spacing of 6.5 cM 
(http://sol.marc.usda.gov/genome/genome.html). 

A third-generation map was developed through a collaboration 
of 15 laboratories from across the world using animals 
from the IMF (Maddox et al., 2001). The map contained 
1,062 loci (941 anonymous loci and 121 genes), and spanned 
3,400 cM (sex-averaged) for the autosomes and 132 cM (female) 
on the X chromosome. The IMF map is continually 
being expanded. The most recent publication (Maddox et al., 
2003) contained 1,221 loci, including 198 genes and 38 initially 
anonymous ESTs 
(http://rubens.its.unimelb.edu.au/~jillm/jill.htm). 
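
To make the progression between map generations concrete, the sketch below divides each published map length by its marker count. This is only a naive density figure, computed here for illustration from the numbers quoted above: it is not the published average marker spacing (the 6.5 cM reported for the second-generation map is measured between distinct map positions), and for the third-generation map the locus count is assumed to apply to the autosomal length. The class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkageMap:
    label: str
    markers: int
    length_cm: float

    def cm_per_marker(self) -> float:
        """Naive density: total map length divided by the number of markers."""
        return self.length_cm / self.markers


# Marker counts and map lengths as quoted in the text.
MAPS = [
    LinkageMap("Crawford et al., 1995", 246, 2070.0),
    LinkageMap("de Gortari et al., 1998", 519, 3063.0),
    LinkageMap("Maddox et al., 2001", 1062, 3400.0),
]

if __name__ == "__main__":
    for m in MAPS:
        print(f"{m.label}: {m.cm_per_marker():.1f} cM per marker")
```

The ratio falls with each generation, which is the point of the comparison; the absolute values should not be read as inter-marker distances.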

It should be noted that these ovine linkage maps include 
relatively few expressed genes (~200) because of the difficulty 
in identifying allelic variation needed for linkage analyses. 
Additional assignments for about 250 genes have been made 
using somatic cell hybrids (Saidi-Mehtar et al., 1979; Burkin et 
al., 1991) and in situ hybridization (Table 1). 

The ovine linkage map contains a large number of bovine-derived 
microsatellite markers. However, not all bovine microsatellites 
are usable. Some do not amplify ovine genomic DNA, 
no doubt because of divergence in the primer sequences. Other 
bovine microsatellites are polymorphic in cattle, but monomorphic 
in sheep. Crawford et al. (1995) found that 133 (53%) 
of 251 cattle microsatellites screened were polymorphic in 
sheep. In another study by de Gortari et al. (1997), 58% (605/ 
1036) of the tested bovine microsatellites amplified a locus in 
sheep. Of the amplified loci, 67% (409/605) were informative 
in the sires of the IMF and USDA ovine mapping families. 
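
The two stages reported by de Gortari et al. (amplification, then informativeness) can be combined into an overall yield per bovine microsatellite tested. The sketch below is an illustrative calculation from the counts quoted above; the overall figure is derived here and is not stated in the source, and the names are illustrative.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


# Counts quoted in the text.
CRAWFORD_POLYMORPHIC = percent(133, 251)       # polymorphic in sheep
AMPLIFIED = percent(605, 1036)                 # amplified a locus in sheep
INFORMATIVE_IF_AMPLIFIED = percent(409, 605)   # informative among amplified loci
# Derived: informative loci per bovine microsatellite tested.
INFORMATIVE_OVERALL = percent(409, 1036)

if __name__ == "__main__":
    print(f"Polymorphic (Crawford et al., 1995): {CRAWFORD_POLYMORPHIC:.1f}%")
    print(f"Amplified in sheep: {AMPLIFIED:.1f}%")
    print(f"Informative among amplified: {INFORMATIVE_IF_AMPLIFIED:.1f}%")
    print(f"Informative overall: {INFORMATIVE_OVERALL:.1f}%")
```

On these counts, roughly two in five bovine microsatellites tested ended up as informative ovine markers.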

Sequence divergence between cattle and sheep also creates 
problems when using bovine STSs derived from BACs, YACs 
and ESTs on sheep DNA and vice versa. For example, development 
of a contig for the callipyge locus in sheep was hampered 
by this sequence divergence. The original plan was to isolate 
ovine BACs, develop STSs and order the ovine STSs using the 
bovine 5,000 and 12,000 rad RH panels. In this way, the 
corresponding location of callipyge on the human and mouse 
genomes could be inferred. Unfortunately, only 10 of 26 (38%) 
ovine STSs amplified bovine genomic DNA and so only these 
could be typed on the RH panels and used for orientating the 
contig. Because of this limitation, the experimental approach 
was revised. A bovine-specific BAC contig was developed 
(Shay et al., 2001), the resulting bovine STSs were oriented 
with the bovine RH panels, and then an ovine-specific BAC 
contig was constructed using a limited number of bovine STSs 
that amplified ovine genomic DNA (Segers et al., 2000). 

Ovine radiation hybrid map 

An ovine radiation hybrid (RH) panel has been developed 
at INRA, France (Andre Eggen, personal communication). 
This panel will be suitable for physical contig construction and 
fine mapping of candidate regions for QTL and single-gene 
traits. However, development of a whole-genome map using 
this panel will be difficult because the high radiation dose 
(10,000 rad) creates relatively small DNA fragments, requiring 
marker coverage that is much denser than is needed 
for whole-genome mapping. Therefore, a low-rad RH panel is 
currently being developed in a joint collaboration between 
Utah State University and Texas A&M University. Microsatellites 
that have been previously linked to single-gene traits and 
QTLs will be localized on the resulting framework map. This 
map will be extended to include ESTs that have been produced 
by large-scale sequencing of ovine cDNAs. The combination of 
microsatellites and ESTs on a single RH map will provide a 
direct link between the sheep and human/mouse maps. In this 
way, genes influencing economically important traits in sheep 
can be identified using positional candidate cloning strategies. 
An equally important outcome of this project will be the 
comparison of locus order between two ruminant species (sheep 
and cattle), and between sheep and humans, providing additional 
information for understanding chromosomal evolution. 

Comparative maps 

There are some estimates of gene order for sheep and cattle. 
In one comparison (de Gortari et al., 1998), marker order was 
inverted eight times (> 2 cM intervals) when the sheep map was 
compared to the cattle linkage map (Kappes et al., 1997). 
Length of the inversions ranged from 3 to 57 cM. This group 
also found six microsatellites that mapped to sheep linkage 
groups other than what would be expected based on the bovine 
homolog. The authors suggest that these may be the result of 
amplification of a nonhomologous sequence by heterologous 
primers. 


Table 1. Sheep genome statistics (a) 

Chromosomes | 26 autosomes, X, Y 
Predicted genome size | 3,400 cM 
Linkage map 
  Genes | 200 
  Random markers | 1,150 
Physical map 
  Somatic cell hybrid | 250 genes and markers 
  Cytogenetic map | 850 genes and markers 
Comparative map | 300 loci 
EST sequences 
  In database | 6,750 
BAC libraries | 3 

(a) Data predicted from combination of published and unpublished information. 

Conserved chromosomal segments between sheep and humans 
have been determined using human chromosome painting 
probes on R-banded sheep chromosomes (Iannuzzi et al., 
1999) and by probing Indian muntjac deer chromosomal 
preparations with human and sheep painting probes (Burkin et al., 
1997). A total of 48 human segments were found in sheep 
chromosomes. This estimate has been further refined (Maddox et 
al., 2003) to include 53 sheep-human chromosome combinations. 

Ovine informational database 

An informational database that includes mapped loci in 
sheep is now available through the Roslin Institute (United 
Kingdom): http://www.thearkdb.org or Texas A&M University 
(USA): http://texas.thearkdb.org. The current version of SheepBase 
contains about 1722 loci gathered from 518 publications. 
There are about 370 designated genes, 1327 Type II loci and 25 
other loci (such as blood group polymorphisms) in the database. 
Of the 2832 map assignments, 1988 were done by linkage 
and 844 by cytogenetic methods. 

Conclusion 

Genome maps in livestock species have been under development 
for the last decade. Development of the ovine genome 
map has allowed researchers to map numerous traits to specific 
chromosomal regions and, in many cases, to identify the causative 
mutations. Connecting the ovine genome map with maps 
of other species will further advance these studies. 
Abstract. 65,000 sheep skin cDNA clones were gridded at 
high density onto nylon membranes and screened for (CA)n 
and (GA)n repeat-containing clones. 296 dinucleotide repeat-containing 
clones were identified with ~85% non-redundancy. 
Clones were single-pass 5' sequenced and we compared the 
Expressed Sequence Tag (EST) sequences to the Swiss-Prot 
database to ascertain their identity and/or putative function. 
We then aligned the ESTs against the human genomic sequence 
to determine the locations of human orthologous sequences. 
Finally, we developed a subset of polymorphic microsatellite 
markers and positioned them on the ovine linkage map. 

Copyright©2003 S. Karger AG, Basel 


Sheep were among the earliest domesticated animals and 
still play a major role in animal agriculture around the world. A 
1994 census claimed that over a billion sheep are being raised 
around the world (Piper and Ruvinsky, 1997). Aside from their 
agricultural importance, sheep play a role as animal models for 
some inherited human diseases (Tan et al., 1997; Broom et al., 
1998; Wright et al., 1999). In spite of this, the linkage map of 
the sheep genome lags far behind that of humans, model organisms, 
and other agricultural animals such as pigs and cattle. The 
current sheep map has over 1,200 markers, but only 15% of 
these correspond to known genes (Maddox et al., 2001, 2002). 
One reason for the small number of markers derived from 
expressed sequences in the linkage map is the difficulty in 
identifying allelic variants within gene sequences (Cockett et al., 
2001). 
The availability of several thousand expressed sequence tags 
(ESTs) in public databases, and the observation that some of 
them carry hypervariable repetitive sequences, have 
prompted several groups to mine databases for microsatellite-containing 
EST sequences as a source of genetic markers in 
organisms as varied as turkeys, pigs, barley, wheat, apricots 
and grapes (Holton et al., 2002; Rohrer et al., 2002; 
Decroocq et al., 2003; Dranchak et al., 2003). 

An advantage of these EST-derived microsatellite markers 
is that they are both type I (gene) and type II (variable, usually 
non-coding) markers. They are invaluable in cross-referencing 
genomes of related species and in comparative genomics. 

We generated three cDNA libraries from adult and fetal 
sheep skin to study gene expression during wool follicle initiation 
and development (Adelson et al., 2003). 65,000 randomly 
picked clones from these libraries were systematically screened 
for dinucleotide repeat-containing transcripts. Over 250 such 
transcripts were isolated and the development and characterization 
of a subset of these as genetic markers is described 
here. 

Materials and methods 

cDNA library construction and arraying 

Skin samples were removed from adult sheep and sheep fetuses at two 
different lengths of gestation (73-75 days and 103-105 days). Total RNA 
was extracted using a method described previously (Chomczynski and 
Sacchi, 1987). Three different polyT-primed cDNA libraries were constructed 
by a commercial vendor (Clontech, Palo Alto, CA) in the λTriplEX cloning 
vector. 
The libraries were converted en masse to pTriplEX using the protocol 
provided by Clontech (Protocol PT3003-1, http://www.clontech.com). 
Converted libraries were plated out in 245 x 245 mm Bioassay plates (Nunc, 
Denmark) and ~65,000 clones (~20,000 clones/library) were picked into 
384-well microtitre plates using a colony picking service (AGRF, Melbourne, 
Australia). 

Gridding of high-density filters 

Clones from arrayed cDNA libraries were gridded onto 12 x 8 cm nylon 
membranes (Hybond, Amersham Biosciences, Piscataway, NJ) laid on LB 
agar plates using a Biomek 2000 robotic workstation (Beckman Instruments, 
Fullerton, CA). Clones were gridded in a 5 x 5 design enabling 12 384-well 
plates of clones to be gridded in duplicate (4,608 individual clones) on each 
filter. Gridded clones were grown on filters overnight at 37 °C. Bacterial 
clones were lysed and DNA immobilized on nylon membranes by placing the 
membranes for 5 min at each step on filter paper soaked in the following 
solutions: (1) 10% SDS, (2) 0.5 M NaOH, 1.5 M NaCl, (3) 0.5 M Tris (pH 8), 
1.5 M NaCl, (4) 0.5 M Tris (pH 8), 1.5 M NaCl (step 3 repeated), (5) 2x SSC, 
0.5% SDS, (6) 2x SSC. Filters were then dried on fresh filter paper and UV 
cross-linked before use. 
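
The gridding figures above follow from simple arithmetic, sketched below. The sketch assumes that each of the 384 pin positions prints one 5 x 5 block of spots, so that 12 plates in duplicate occupy 24 of the 25 spots in a block, and it takes the approximate total of 65,000 clones at face value; the number of filters is derived here and is not stated in the source. Function and key names are illustrative.

```python
import math

WELLS_PER_PLATE = 384
PLATES_PER_FILTER = 12
GRID_SIDE = 5          # 5 x 5 spots printed per pin position (assumed layout)
REPLICATES = 2         # every clone is spotted in duplicate


def filter_layout(total_clones: int = 65_000) -> dict:
    """Work out spot usage per block and the number of filters required."""
    spots_per_block = GRID_SIDE ** 2
    used_per_block = PLATES_PER_FILTER * REPLICATES
    if used_per_block > spots_per_block:
        raise ValueError("too many plates for this grid design")

    clones_per_filter = WELLS_PER_PLATE * PLATES_PER_FILTER
    return {
        "clones_per_filter": clones_per_filter,
        "spots_per_filter": clones_per_filter * REPLICATES,
        "unused_spots_per_block": spots_per_block - used_per_block,
        "filters_needed": math.ceil(total_clones / clones_per_filter),
    }


if __name__ == "__main__":
    print(filter_layout())
```

The 4,608 clones per filter quoted in the text correspond to 12 x 384; under the stated assumptions, about 15 filters would hold the full set of clones.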

Probe preparation and hybridization 

Synthetic (CA)10 and (GA)10 oligonucleotides were 5' end labeled with 
[α-32P]dATP using T4 polynucleotide kinase (Roche Applied Sciences, 
Castle Hill, Australia) according to the manufacturer’s instructions. Following 
overnight hybridization at 50 °C in a standard buffer containing 50% Denhardt’s 
solution, membranes were washed three times (15 min each) in 6x SSC at 
room temperature followed by a single, short (1-2 min) wash in 6x SSC at 
50 °C. Membranes were exposed to Kodak X-OMAT autoradiography film 
for 1.5 h to overnight. 

Isolation and sequence analysis of cDNA clones 

Clones identified by the hybridization screen were inoculated into 5 ml 
of 2x TY medium and grown overnight at 37 °C. Plasmid DNA was isolated 
using a Promega Wizard 96 DNA isolation kit (Promega, Madison, WI, 
USA). All DNA sequencing was performed on an ABI 373XL DNA sequencer 
using fluorescently labeled dye-terminators (Applied Biosystems, Foster 
City, CA) and pTriplEX sequencing primers (Clontech, Palo Alto, CA, USA). 
Chromatograms were transferred to a UNIX workstation for base calling, 
quality clipping, vector clipping and repeat masking with Phred, Phrap 
(Ewing et al., 1998; Ewing and Green, 1998) and RepeatMasker (Smit, 
1996-2002) and clustering/contiging was done with the following command line: 
phredPhrap -penalty -15 -shatter_greedy -bandwidth 30 -minscore 100. 
Consed was used to manually edit/verify contigs (Gordon et al., 1998). 

Sequence similarity searches 

We performed gapped BLAST (Altschul et al., 1994, 1997) searches with
HT-BLAST (http://www.sgi.com/industries/sciences/chembio/resources/papers/HTBlast/HT_Whitepaper.html)
on an SGI Origin 3800 48-processor machine. High-sensitivity BLASTN
searches were performed against Build 33 of the human genome using the
settings of Ian Korf (http://sapiens.wustl.edu/~ikorf/mmhs/index.html):
-W 7 -r 17 -q -21 -f 280 -G 29 -E 22 -X 240.
BLASTX searches were carried out using default settings against the
Swiss-Prot database. BLAST output was parsed into tabular form using a Perl
script based on modules and examples provided in the BioPerl distribution
(http://www.bioperl.org/).
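
The authors' Perl script is not reproduced in the text. As a rough illustration of the same kind of post-processing, the sketch below keeps the best-scoring hit per query from tab-delimited BLAST output. It assumes the standard 12-column tabular layout; the function name, the clone and accession names, and the sample rows are all hypothetical.

```python
import csv
import io

# Column order of the standard 12-column BLAST tabular output.
COLUMNS = ["query", "subject", "pct_identity", "length", "mismatches",
           "gap_opens", "q_start", "q_end", "s_start", "s_end",
           "evalue", "bit_score"]

def best_hits(handle, max_evalue=1e-5):
    """Return the highest-scoring hit per query with E-value <= max_evalue."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bit_score"] = float(hit["bit_score"])
        if hit["evalue"] > max_evalue:
            continue
        previous = best.get(hit["query"])
        if previous is None or hit["bit_score"] > previous["bit_score"]:
            best[hit["query"]] = hit
    return best

# Hypothetical rows, for illustration only (not data from the study).
sample_rows = [
    ("cloneA", "P00001", "86.0", "200", "28", "0", "1", "600", "1", "200", "1e-106", "382"),
    ("cloneA", "P00002", "40.0", "150", "90", "2", "1", "450", "5", "154", "2e-03", "35"),
    ("cloneB", "P00003", "68.0", "60", "19", "0", "10", "189", "1", "60", "7e-06", "47"),
]
sample_text = "\n".join("\t".join(r) for r in sample_rows)
hits = best_hits(io.StringIO(sample_text))
for query, hit in sorted(hits.items()):
    print(query, hit["subject"], hit["bit_score"], hit["evalue"])
```

The E-value cut-off is a parameter, so the same function could be reused with a stricter threshold when selecting putative orthologs.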

Microsatellite marker development and verification 

PCR primers from sequence flanking the repetitive regions were designed
with the assistance of Primer0.5 software
(http://www.genome.wi.mit.edu/ftp/distribution/software/primer.0.5/) and
were synthesized commercially with fluorescent labels (6-FAM, HEX or TET).
16 ng of genomic DNA was amplified in a 4-µl reaction volume consisting of
67 mM Tris-HCl (pH 8.8), 16.6 mM (NH4)2SO4, 0.2 mg/ml gelatin, 0.45%
Triton X-100, 1.5 mM MgCl2, 100 µM dNTPs, 2.7 pmol of both the forward and
reverse primers, 0.1 U Amplitaq DNA polymerase (Applied Biosystems,
Melbourne, Australia), 0.22 µg of TaqStart antibody (Clontech, Palo Alto,
CA, USA) and 0.4 µCi [α-33P]dATP (Geneworks, Adelaide, Australia).
Reactions were set up in a 96-well plate and run on a DNA thermal cycler
(PTC-100, MJ Research Inc, Waltham, MA, USA), using the following
conditions: one cycle of denaturation at 95 °C (2 min 50 s); 30 cycles of
denaturation at 95 °C (10 s), annealing at 58 °C (30 s), extension at 72 °C
(30 s); and one cycle of extension at 72 °C (2.5 min).

Fig. 1. A high-density clone array probed with (CA)n and (GA)n probes.
There are 4,608 individual clones (12 384-well plates) double-spotted onto
the 12 x 8 cm filter. The intensity of the hybridization signal corresponds to
the length of the dinucleotide repeat present in the clone.

PCR products were diluted 1:2 in formamide loading dye, heated at
95 °C for 5-10 min and then analysed either by denaturing polyacrylamide
gel electrophoresis (6% acrylamide:bis [19:1], 7 M urea, 1x TBE) at 50 °C
using a StrataTherm (Stratagene, La Jolla, CA, USA) temperature controller,
or on an ABI373 using Genescan for fragment analysis.
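
As a quick check on the cycling conditions given above for marker amplification, the programmed hold time can be totalled as follows. This is only arithmetic on the stated step lengths; ramping time between temperatures depends on the instrument and is not included.

```python
# Programmed hold times (seconds) for the microsatellite PCR described above.
initial_denaturation = 2 * 60 + 50   # 95 C, 2 min 50 s
cycle = 10 + 30 + 30                 # 95 C 10 s, 58 C 30 s, 72 C 30 s
n_cycles = 30
final_extension = 2.5 * 60           # 72 C, 2.5 min

total_seconds = initial_denaturation + n_cycles * cycle + final_extension
print(f"{total_seconds:.0f} s = {total_seconds / 60:.1f} min (ramping excluded)")
```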

Mapping 

The markers were genotyped on individuals from the International Mapping
Flock (IMF), and on a set of reference families developed by CSIRO
Livestock Industries. The IMF consists of 9 three-generation families
comprising a total of 126 individuals (Crawford et al., 1995), whereas the
CSIRO flock comprises 15 full-sib families with 202 individuals, and 231
backcross individuals in two families. The backcrosses were used to estimate
gene frequencies in a population of randomly mated fine-wool Merinos, and
PIC values were calculated from these frequencies (PIC-1, Table 3). In
addition, allele distribution was determined for a population of ~50
unrelated sheep from ten different breeds (Merino, Border Leicester,
Suffolk, Romney, Karakul, Finnish Landrace, Poll Dorset, Dorset, Texel and
Carpet Master), and PIC-2 was calculated from these. Strictly, this second
set of PIC values does not provide a valid estimate of the Polymorphic
Information Content (Anderson et al., 1993), as originally defined, but
these PIC-2 values do give a measure of allelic diversity across breeds.
Linkage analysis in the IMF was performed using CRIMAP v2.4 (Green, 1992)
and MultiMap (Matise et al., 1994) using Gen files from the current version
of the map (4.1). The data from the CSIRO families were analyzed using
in-house software; in general, there was good agreement between the map
positions estimated in the two mapping reference flocks. The map positions
presented in Table 3 are from the IMF.
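
The text does not spell out how PIC was computed. A minimal sketch, assuming the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, is given below. The function name and the example allele frequencies are hypothetical and are not taken from the study.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphic information content from a list of allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term for matings that are uninformative for linkage.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross

# Hypothetical frequencies: two equally common alleles, then four.
print(round(pic([0.5, 0.5]), 3))
print(round(pic([0.25, 0.25, 0.25, 0.25]), 3))
```

With this definition a marker with a single allele has a PIC of zero, and the value rises as alleles become more numerous and more evenly distributed.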

Results and discussion 

Identification of dinucleotide repeat containing cDNA 
clones 

Table 1. Clones that had an orthologous protein present in the Swiss-Prot
database. The organism from which the ortholog was identified, the Swiss-Prot
accession number for the protein, its identity, bit score, E-value and
percent identity for the similarity are given. Only non-redundant hits with
an E-value of 1E-05 or better are shown.

EST clone   Organism                  Accession no.  Description                                         Score  E-value  % Identity
SSFS061F10  Homo sapiens              Q92598         Heat-shock protein 105 kDa                          382    1.E-106  86
SSFS063K12  Equus caballus            P19854         Alcohol dehydrogenase class III chain (EC 1.1.1.1)  377    1.E-104  94
SSFS067L06  Ovis aries                O21619         Cytochrome c oxidase polypeptide III (EC 1.9.3.1)   333    4.E-94   80
SSFS046G21  Homo sapiens              P12750         40S ribosomal protein S4, X isoform                 259    3.E-69   91
SSFS030N20  Bos taurus                Q05443         Lumican precursor (Keratan sulfate proteoglycan)    227    2.E-59   86
SSFP025C04  Homo sapiens              O14933         Ubiquitin-conjugating enzyme E2-18 kDa UbcH8        202    3.E-54   70
SSFP002J14  Homo sapiens              Q9Y3U8         60S ribosomal protein L36                           191    2.E-48   98
SSFS064P04  Mus musculus              P30412         Peptidyl-prolyl cis-trans isomerase C (EC 5.2.1.8)  189    6.E-48   91
SSFS046H18  Stichopus japonica        P21251         Calmodulin                                          101    2.E-37   96
SSFS064E02  Homo sapiens              Q9UHE8         Six transmembrane epithelial antigen of prostate    153    3.E-37   42
SSFS035N24  Mus musculus              P30412         Peptidyl-prolyl cis-trans isomerase C (EC 5.2.1.8)  153    5.E-37   83
SSFS064N04  Bos taurus                P19120         Heat shock cognate 71 kDa protein                   142    1.E-35   61
SSA007P22   Mus musculus              Q8VH51         RNA-binding region containing protein 2             147    3.E-35   72
SSFP023A21  Homo sapiens              O15400         Syntaxin 7                                          110    4.E-24   100
SSFS064M20  Ovis aries                O19097         Selenoprotein W                                     109    6.E-24   94
SSFP006B23  Ovis aries                Q9XSY9         Osteopontin precursor (Bone sialoprotein 1)         109    6.E-24   92
SSFS064A13  Saccharomyces cerevisiae  P47033         PRY3 protein (Pathogen related in Sc 3)             80     4.E-15   41
SSFS068F03  Bos taurus                Q05443         Lumican precursor (Keratan sulfate proteoglycan)    59     2.E-08   100
SSA002H09   Rattus norvegicus         P52909         Transcription factor jun-D                          50     7.E-06   100
SSFS034N08  Hordeum vulgare           P06353         Histone H3 (Fragment)                               47     5.E-05   68

We screened ~65,000 randomly selected sheep skin cDNA clones from three
distinct developmental stages for transcripts with CA and/or GA
dinucleotide repeats. RNA was extracted from three developmental stages:
adult, fetal skin at 73-75 days of gestation and fetal skin at 103-105
days. During the initial screen, 296 clones (0.46% of the clones screened)
hybridized with varying intensity to the radio-labeled (CA)n and (GA)n
oligonucleotides (Fig. 1). Sequence analyses revealed that all 296 clones
contained CA and GA repeats. Of these, 251 sequences were unique (84.8%
non-redundancy). This was especially interesting as none of the cDNA
populations were normalized prior to cloning. Initial random sequencing of
the same libraries showed up to 40% redundancy (Adelson et al., 2003).
Insert sizes of the microsatellite-containing clones ranged from 490 to
1,655 nucleotides. The length of the dinucleotide repeats contained within
the clones ranged from three to 28 repeats. The intensity of hybridization
to the probes correlated with the length of the dinucleotide repeat
contained within the clone (data not shown), and most of the positive
signals were markedly weaker than the strong signals (Fig. 1). As the
strong signals corresponded to longer dinucleotide repeats, it was evident
that clones containing longer, and hence more useful, dinucleotide repeats
were not in the majority. Given that 65,000 clones were screened to
identify these clones, this type of screening is not an economical strategy
for wholesale isolation of Type I microsatellite-containing markers.
However, given that these markers can be used both for linkage/QTL mapping
and as cross-genome anchor loci, this strategy may be worth considering for
organisms without large numbers of sequenced ESTs or available
microsatellites.
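
The yield figures quoted in this subsection follow directly from the raw counts. The sketch below simply reproduces that arithmetic; the variable names are illustrative.

```python
# Counts reported in the text.
clones_screened = 65_000   # approximate number of clones screened
positive_clones = 296      # clones hybridizing to the (CA)n/(GA)n probes
unique_sequences = 251     # non-redundant sequences among the positives

hit_rate_pct = 100 * positive_clones / clones_screened
non_redundancy_pct = 100 * unique_sequences / positive_clones
screened_per_unique = clones_screened / unique_sequences

print(f"hit rate: {hit_rate_pct:.2f}%")
print(f"non-redundancy: {non_redundancy_pct:.1f}%")
print(f"clones screened per unique positive: {screened_per_unique:.0f}")
```

Roughly 260 clones had to be screened for each unique repeat-containing transcript, which is the basis for the remark that the approach is not economical for wholesale marker isolation.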

Identities of dinucleotide repeat-containing transcripts 

To ascertain the identity of these transcripts, all non-redundant
transcripts were compared against the Swiss-Prot protein database
(http://us.expasy.org/) using the gapped-BLAST program (Altschul et al.,
1994, 1997; Boeckmann et al., 2003). Twenty clones were found to be similar
to proteins in the Swiss-Prot database, but no hits were found for 259
clones. We used a similarity cut-off of expectation value (E value) of
1E-20 or lower to identify potential orthologs. This category included 16
clones with the orthologs coming from human, horse, mouse, cattle, sheep
and even sea cucumber proteins (Table 1). We are aware that our stringency
for accepting an ortholog is fairly high, but we opted for a conservative
approach. All 21 clones that had an orthologous protein in the Swiss-Prot
database are shown in Table 1, along with the identity of the ortholog, the
bit score value (Altschul et al., 1994), the E value and % identity.

We compared the repeat-masked EST sequences with the human genomic DNA
database in GenBank to identify the location of the human orthologs of
these transcripts. The position of the orthologs of many of the transcripts
could be identified (Table 2). 14 of the 75 clones yielding hits generated
multiple hits, indicating the probable existence of a conserved gene
family. Putative orthologs were identified on the basis of cut-off values
of >70 for score and/or E-value of <1E-39. Again, these cut-off values are
stringent, but 71 of the 75 clones exceeded these criteria. Of the clones
having multiple hits, all but eight had the top hit E-value separated by
twenty orders of magnitude or more from the next best hit.
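
The "twenty orders of magnitude" criterion can be expressed as a small check on the E-values of a clone's hits. The sketch below is an illustration, not the authors' code: the function names are hypothetical, and replacing a reported E-value of 0 with a floor of 1e-300 before taking logarithms is an assumption needed to avoid log(0).

```python
import math

def log10_evalue(evalue, floor=1e-300):
    """log10 of an E-value; values reported as 0 are floored (assumption)."""
    return math.log10(max(evalue, floor))

def is_unambiguous(evalues, min_orders=20):
    """True if the best hit beats the next best by >= min_orders in E-value."""
    ranked = sorted(evalues)
    if len(ranked) < 2:
        return True  # a single hit has no competitor
    gap = log10_evalue(ranked[1]) - log10_evalue(ranked[0])
    return gap >= min_orders

# Hypothetical E-value lists for three clones.
print(is_unambiguous([1e-150, 1e-120]))  # gap of 30 orders
print(is_unambiguous([1e-60, 1e-50]))    # gap of 10 orders
print(is_unambiguous([0.0, 1e-175]))     # top hit reported as 0
```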

Microsatellite marker development 

PCR primer pairs from sequence flanking the dinucleotide repeats were made
from several transcripts. In addition, two microsatellite markers isolated
serendipitously and previously characterized as part of the EST project
(Adelson et al., 2003) were included in PCR primer pair development. 27 of
these primer pairs were used to genotype the International Mapping Flock
(IMF) (Crawford et al., 1995; Maddox et al., 2001), and 22 of these were
genotyped in the CSIRO families. Table 3 gives details for the markers
including their locus codes, chromosomal position in the sheep genome and
PIC values. Table 4 describes the PCR primers, size distribution of alleles
and PCR conditions. The number of alleles per marker ranged from 16 in
CSAP034 to just two in several markers. Polymorphism information content
(PIC) values ranged from 0.82 in CSAP036 and CSAP037 to 0.12 in CSAP014.
Table 2. Comparison against the human genomic sequence, build 33 from NCBI.
Top hits for sequence similarity search using BLASTN. Clones with more than
a single hit with an E-value of 1E-20 or better are on a gray background; of
these, the eight ambiguous orthologs are indicated in bold font. The clone,
human chromosome hit, bit score, E-value, percent identity and starting
coordinate for the hit are shown.

Clone        Hit                                             Score  E-value  % Identity  Coordinate
SSA022L02    Homo sapiens chromosome 15, complete sequence   700    0.E+00   85%         30758271
SSFP002F19   Homo sapiens chromosome 12, complete sequence   934    0.E+00   89%         69688912
SSFP003B18   Homo sapiens chromosome 17, complete sequence   823    0.E+00   91%         38201851
SSFP004I21   Homo sapiens chromosome 16, complete sequence   714    0.E+00   89%         69473361
SSFP032D02   Homo sapiens chromosome 16, complete sequence   680    0.E+00   88%         69473361
SSFS004D15   Homo sapiens chromosome 5, complete sequence    689    0.E+00   79%         98221924
SSFS024P16   Homo sapiens chromosome 12, complete sequence   721    0.E+00   82%         69688491
SSFS031B13   Homo sapiens chromosome 15, complete sequence   639    0.E+00   80%         60942024
SSFS038J08   Homo sapiens chromosome 12, complete sequence   669    0.E+00   84%         102724498
SSFS057A01   Homo sapiens chromosome 6, complete sequence    661    0.E+00   89%         43149926
SSFS063E16   Homo sapiens chromosome 5, complete sequence    821    0.E+00   92%         177705801
SSFS063K12   Homo sapiens chromosome 1, complete sequence    771    0.E+00   85%         236630146
SSFS064O13   Homo sapiens chromosome 17, complete sequence   840    0.E+00   91%         37472517
SSFS064D20   Homo sapiens chromosome 16, complete sequence   687    0.E+00   89%         69473407
SSFS065I18   Homo sapiens chromosome 2, complete sequence    648    0.E+00   83%         36734738
SSFP032F08   Homo sapiens chromosome 9, complete sequence    621    1.E-175  87%         103231435
SSFS004L22   Homo sapiens chromosome 10, complete sequence   597    1.E-168  83%         72474963
SSFS045F01   Homo sapiens chromosome 9, complete sequence    593    1.E-167  81%         81196377
SSFS010L05   Homo sapiens chromosome 17, complete sequence   585    1.E-164  84%         66396986
SSFS058P01   Homo sapiens chromosome 17, complete sequence   580    1.E-163  78%         36987588
SSFS039K12   Homo sapiens chromosome 19, complete sequence   559    1.E-157  78%         45622797
SSFS034N08   Homo sapiens chromosome 1, complete sequence    559    1.E-156  85%         174743903
SSFS033P24   Homo sapiens chromosome 3, complete sequence    555    1.E-155  86%         115315685
SSFP010E14   Homo sapiens chromosome 20, complete sequence   543    1.E-152  79%         33536965
SSFP006B23   Homo sapiens chromosome 4, complete sequence    539    1.E-150  82%         89296650
SSFS030O19   Homo sapiens chromosome 12, complete sequence   535    1.E-149  86%         13269508
SSFS046N20   Homo sapiens chromosome 16, complete sequence   533    1.E-149  78%         3777108
SSFS063P23   Homo sapiens chromosome 10, complete sequence   522    1.E-145  75%         5599805
SSFS017H21   Homo sapiens chromosome 10, complete sequence   519    1.E-144  76%         5599799
SSFS037O09   Homo sapiens chromosome X, complete sequence    516    1.E-144  81%         103234356
SSFP001M11   Homo sapiens chromosome 1, complete sequence    513    1.E-143  74%         222090857
SSFP005O22   Homo sapiens chromosome X, complete sequence    511    1.E-142  78%         103234419
SSFS064I08   Homo sapiens chromosome 10, complete sequence   501    1.E-139  73%         3919363
SSFS022I07   Homo sapiens chromosome 19, complete sequence   484    1.E-134  85%         45622885
SSFS064N04   Homo sapiens chromosome X, complete sequence    483    1.E-134  83%         118289623
SSFP025M07   Homo sapiens chromosome 17, complete sequence   453    1.E-125  81%         44361105
SSFS046G21   Homo sapiens chromosome 8, complete sequence    454    1.E-125  83%         48182532
SSFP002J14   Homo sapiens chromosome 14, complete sequence   433    1.E-119  87%         71386348
SSFP003H07   Homo sapiens chromosome X, complete sequence    424    1.E-116  80%         26736354
SSFP030D04   Homo sapiens chromosome 12, complete sequence   420    1.E-115  77%         100666304
SSFS061F03   Homo sapiens chromosome 15, complete sequence   407    1.E-111  74%         34271361
SSFP012H21   Homo sapiens chromosome 5, complete sequence    400    1.E-109  81%         132471595
SSFS022A19   Homo sapiens chromosome 12, complete sequence   398    1.E-108  70%         91430380
SSFS010O13   Homo sapiens chromosome 2, complete sequence    395    1.E-107  93%         118774093
SSFS030N20   Homo sapiens chromosome 12, complete sequence   379    1.E-102  69%         91430382
SSFS062I13   Homo sapiens chromosome 17, complete sequence   375    1.E-101  78%         36987293
SSFS067L06   Homo sapiens chromosome 7, complete sequence    352    2.E-94   70%         56998785
SSFP003N12   Homo sapiens chromosome 1, complete sequence    337    6.E-90   78%         211453419
SSFS044O18   Homo sapiens chromosome 16, complete sequence   335    2.E-89   68%         3777333
SSFP006L12   Homo sapiens chromosome 4, complete sequence    316    1.E-83   71%         124713744
SSA023D05    Homo sapiens chromosome 9, complete sequence    306    2.E-80   75%         125132445
SSFS047I06   Homo sapiens chromosome 2, complete sequence    289    2.E-75   70%         26570056
SSFP003E24   Homo sapiens chromosome 7, complete sequence    273    1.E-70   80%         99760497
SSFP0025I02  Homo sapiens chromosome 14, complete sequence   269    1.E-69   80%         97854128
SSFS062F02   Homo sapiens chromosome 4, complete sequence    270    1.E-69   72%         25410484
SSA024I11    Homo sapiens chromosome 11, complete sequence   265    3.E-68   86%         40201466
SSFP017O01   Homo sapiens chromosome 2, complete sequence    265    3.E-68   72%         31969232
SSFP005M14   Homo sapiens chromosome 6, complete sequence    261    4.E-67   69%         10460030
SSFP017L11   Homo sapiens chromosome 8, complete sequence    260    1.E-66   71%         40862030
SSFP004G06   Homo sapiens chromosome 14, complete sequence   245    3.E-62   89%         71732558
SSFS046H18   Homo sapiens chromosome 13, complete sequence   235    4.E-59   96%         40681741
SSA002H09    Homo sapiens chromosome 19, complete sequence   227    9.E-57   76%         18236087
SSA022O07    Homo sapiens chromosome 6, complete sequence    226    2.E-56   97%         127583278
SSFS064N12   Homo sapiens chromosome X, complete sequence    225    3.E-56   66%         8493885
SSFP004E11   Homo sapiens chromosome 3, complete sequence    215    4.E-53   76%         124543094
SSFS045B09   Homo sapiens chromosome 17, complete sequence   199    2.E-48   65%         48592289
SSFP003M04   Homo sapiens chromosome 18, complete sequence   186    2.E-44   78%         33856751
SSFS043B12   Homo sapiens chromosome 18, complete sequence   181    6.E-43   76%         33856751
SSA020A13    Homo sapiens chromosome 16, complete sequence   140    1.E-30   91%         30015184
SSFS064P04   Homo sapiens chromosome 5, complete sequence    114    7.E-23   83%         122390452
SSFS001M02   Homo sapiens chromosome 5, complete sequence    81     9.E-13   76%         81610639
SSFS035N24   Homo sapiens chromosome 1, complete sequence    75     5.E-11   80%         42595206
SSFS057L20   Homo sapiens chromosome 1, complete sequence    70     2.E-09   75%         63112955
SSFS057L21   Homo sapiens chromosome 4, complete sequence    59     5.E-06   69%         31850713


Table 3. Polymorphic markers, clone names, locus names, PCR conditions,
number of alleles and map position are given. PIC-1 and PIC-2 are allelic
diversity measures based respectively on a single random mating Merino
population and on a collection of diverse sheep breeds (see text). Markers
with missing allele information or PIC values etc. have not yet been fully
characterized.

Primer   Locus      Clone(s)                Chr  Map    Alleles  PIC-1  PIC-2
CSAP001  ENAH       SSFP001M11              12   41.4   8        0.603  0.72
CSAP002  PTP4A2     SSFP004E11              2    281.4  5        0.249  0.45
CSAP004  NFAT5      SSFP004I21              NA          1        0      0
CSAP005             SSFP006I06              NA          1        0      NA0
CSAP006             SSFS014E12              NA          1        0      NA0
CSAP007  INSIG2     SSFS010O13              2    218.7  5        0.689  0.54
CSAP008  LOC146961  SSFS045B09              11   62.4   8        0.390  0.59
CSAP009  CSAP009E   SSFS061M24              3    231.5  11       0.734  0.64
CSAP010             SSFS064D20              NA   NA     1        0      0
CSAP011  LOC169499  SSFS045F01, SSFS035M03  2    52.8   11       0.812  0.74
CSAP012  CSAP012E   SSFS042B10              11   69.1   9        0.742  0.72
CSAP013  CSAP013E   SSFS048F15              11   0.87   6        -      0.000
CSAP014  ADH5       SSFS063K12              6    49.8   2        -      0.12
CSAP016  LOC340481  SSFS004I17              2    110.5  6        0.543  0.43
CSAP017  CSAP017E   SSFS006A19              3    186.4  10       0.807  0.69
CSAP018  CSAP018E   SSFS033P24              1    214.6  9        -      0.78
CSAP019  LUM        SSFS022A19              3    177.8  8        0.626  0.68
CSAP024             SSFS037B03              NA   NA     1        0      0
CSAP025  NUMB       SSFS063I04              7    118    5        -      0.33
CSAP026  P434D1335  SSFP026K11              14   72.9   2        -      0.19
CSAP027  RBT1       SSFS039K12              14   85.3   5        -      0.60
CSAP028  KIF26A     SSFP003M04              18   122.5  5        -      0.62
CSAP030  FLJ90119   SSFAO11062              21   29.1   15       0.767  0.79
CSAP031  LBH        ad63                    3    73.9   8        0.707  -
CSAP033  CSAP033E   fpl 13                  1    141.4  12       -      0.71
CSAP034  CSAP034E   SSFP018O01              7    106.7  16       -      -
CSAP035  TPM1       SSFS031B13              7    68.7   4        0.546  0.31
CSAP036  CSAP036E   SSA019O66               1    97     10       0.751  0.82
CSAP037  CSAP037E   SSFP007N13              16   0.0    12       0.822  0.78
CSAP038  KIAA1580   SSA024I11               15   103.2  12       0.721  0.76
CSAP039  CSAP039E   SSA005P09               3    268.8  11       0.665  0.75
CSAP040  IGF1       SSFS038J08              3    225.6  7        0.583  0.62


Table 4. Primer set names, primer sequences, PCR product sizes, repeat and PCR conditions are given.

Primer   Forward                        Reverse                        Size     Repeat                   Temp  Mg
CSAP001  TTAGGTGGTAGGAAGAAGG            TGTAACATACAGGATTGCTC           163-177  (GT)12                   50    2.0
CSAP002  TCAGGCAGGCACTGTTACTG           AGAGGAAGGGGAAGAGAGAG           119-129  (GT)16                   56    1.5
CSAP004  TAGCTAATGAGCCTCAGG             TAAGGGAACAAGGTGAAG             198      (CA)20                   NA    NA
CSAP005  TTCTTTCATTTTACCCTC             CCTATTTTCTTTCACACTTC           173      (GT)10                   NA    NA
CSAP006  GCACACTGTAGATCCCTCA            GGATACACAGCTTTAGGTAGAA         136      (CA)18                   NA    NA
CSAP007  GATGGCACTTATCTGATACAC          CATTACAAGAGCAGCACAC            168-178  (GT)9..(GT)12            56    1.5
CSAP008  CAGGAGCAGAAGTCAGAGCCA          GTCAGAACATCCTCCAGCCCA          189-225  (AC)6GC(AC)5GC(AC)6      58    1.5
CSAP009  AGTAGGAAATTGCAACCCACTC         GTAGCATTTCTTGGCTTCCATAAC       210-238  (AC)20                   56    1.5
CSAP010  AGTTCTGTGTTTAACCATATTACCGTG    TGAAAGTCATAAACACTTCTGTGCAG     226      (AC)3AA(AC)9AT(AC)4
CSAP011  CAGTCTCTGGATCTGCTTCTGTAC       AGTTTCAAATACCAGGTCACAATTT      148-182  (CA)21                   58    1.5
CSAP012  CAGTCTGTGTGAGAAGTGAGAAGAG      CAGTATTCTTGCCTGGAAATTCTC       174-190  (GT)16                   56    1.5
CSAP013  CTATGAAAGGAGCTGCTATGTACACAT    TAGCCCAACCTTTATGCAAGCT                  (AC)16
CSAP014  TCATGGTGAGTATGCCTGGTTTC        TCAGCAACCATTCTTATCCGAGA        148-150  (GT)8                    58    1.5
CSAP016  AGTTTTGCATTTCCTCTTAGACAAG      CAGATGAAGGTTTCTGCATCTGT        184-197  (GT)13                   57    1.5
CSAP017  CAGCTTTAGAATATGAGCCATCTCC      GTCATTCTTTCACCCCAGGTAATA       211-229  (GT)20                   57    1.5
CSAP018  TGAAAGGTGAACCCGTTTCCTGAT       CAACCACACAGACACTGAAAATACCACT   240-     (CA)17                   58    1.5
CSAP019  CATGCTTTTGGGCACATTAAGTT        GAGAAGATGAAAGGCAGGCCTAT        189-235  (GT)20                   58    1.5
CSAP024  CATGGACTTCTGAGCTGGCTG          AGTTGAGCCGAGGTTGCTGTC          152      (AC)5ATAG(AC)9
CSAP025  CAGGTTACCGTGTTGGTTTGTC         TCAGCTCACCACAAGAAGAGG          187-209  (GT)9(GCGTGT)2GC(GTGC)5  56    1.5
CSAP026  TCACTAAATTGTATTTGGCAATTCC      TGACTGACAAAATTCTCTACCATCC      243-     (GT)7AT(GT)5(GTCT)2      56    1.5
CSAP027  TCATCCCAGTCTTGGTGGGAT          CATGCAAGACAATTTGTGGAACC        137-     (GT)14                   56    1.5
CSAP028  CAGGAATTCTTACCAAAACCACAAG      CAGCACCTTTACAAAACCCTCAC        97-115   (AC)14                   56    1.5
CSAP030  AGTTCAGGCCCCACACCACTAC         GTCAGAGACAGCTGGCAGAGTAAGA      175-233  (GT)22                   56    1.5
CSAP031  CAGCTCCTCGGGCAAGTGG            CACGGCAAGACTGGGACAGATG         249-265  (TC)6TT(TC)15            56    1.5
CSAP033  TCAGATTCTAAAGCCAGAGAACGT       AGCTCGTTCCAAATCTAACAAAAC       205-     (GT)14                   56    1.5
CSAP034  CAGTGCAGCCAAATGATTAAGTGT       CAGAAATTTGCTGGTCAAATAGCA       144-     (GT)9GCGTGTGC(GT)5       56    2
CSAP035  GTCAACTCGGGAAATGGCAC           CATGTGAAAACCTCCTTAGCTGC        251-257  (GT)10                   56    1.5
CSAP036  TCATTTGGATGCTTTAGTTTTCTTG      TCAGTAGTAAGTGATTTGGCATGG       215-233  (AC)13                   56    1.5
CSAP037  CAGGCATATTTTGGGGTACGGTT        CACCTCAATGCCACATACTCCAAG       174-204  (GT)17                   56    1.5
CSAP038  CAGCTATTTTAACCACTGGCCATT       CAGATCTTCCAACTCTAAATGTGCG      100-134  (AC)18                   56    1.5
CSAP039  CACCAAAGAAGACAAGGGGAAAC        CAGCAACTCTGCTAAAAGCAAACA       144-160  (AG)5..(AG)5..(AG)14     56    1.5
CSAP040  CAGTATTGAAAAGATGATTGAGTTACTCA  TCAGTGATTGCTTTTATGAGAGCTA      264-284  (GT)15                   56    1.5
Among the major agricultural animal species, the sheep genome is one of
the least well characterized. The current sheep linkage map consists of
1,223 loci with only 240 genes mapped on it
(http://rubens.its.unimelb.edu.au/~jillm/jill.htm). In addition, ~250 other
expressed sequences have been assigned to the sheep genome via somatic cell
hybrid mapping (Saidi-Mehtar et al., 1981; Burkin et al., 1991, 1998;
Cockett et al., 2001). In this study we have identified ~250 transcribed
sequences that also contain hypervariable dinucleotide repeats and
developed 32 microsatellite markers, 27 of which have been mapped. As these
transcripts have the potential to serve as both type I and type II markers,
they will be invaluable in augmenting sheep genome mapping efforts. These
sequences can be used to cross-reference the sheep genome to other
mammalian genomes and to ascertain regions of conserved synteny between the
ovine genome and human, mouse and cattle genomes.
Identification of a novel ovine PrP
polymorphism and scrapie-resistant genotypes
for St. Croix White and a related composite
breed
Abstract. Susceptibility to scrapie is primarily controlled by
polymorphisms in the ovine prion protein gene (PRNP). Here, we report a
novel ovine exon three PRNP polymorphism (SNP G346C; P116), its
association with the ovine ARQ allele (P116A136R154Q171), and two new
genotypes (PARQ/ARR; PARQ/ARQ) for the St. Croix White (SCW) breed and a
related composite (CMP) breed developed for meat production. The P116
polymorphism occurs between the N-terminal cleavage site and the
hydrophobic region of the ovine prion protein, a region which exhibits
extreme conservation across mammalian taxa. The relatively high frequency
(0.75) of resistant ARR alleles and the absence of ARQ alleles for the SCW
ewes used as breeding stock for CMP resulted in significant genic
differentiation (P = 0.0123; S.E. = 0.00113). Additionally, the majority
of the SCW (66.7%) and CMP (65.4%) sampled possessed genotypes considered
resistant or nearly resistant to scrapie and experimental BSE (bovine
spongiform encephalopathy).

Copyright©2003 S. Karger AG, Basel 


Scrapie is an inevitably fatal transmissible spongiform encephalopathy
(TSE) affecting sheep and goats. Polymorphisms within exon three of the
ovine host-encoded prion protein gene (PRNP) at codons 136 (Alanine or
Valine; A, V), 154 (Histidine or Arginine; H, R), and 171 (Glutamine,
Arginine, or Histidine; Q, R, or H) are associated with variation in the
phenotypic expression of scrapie including incubation period, clinical
signs, and pathology (Bossers et al., 1996, 2000; reviewed by Hunter,
1997). Of the twelve possible alleles derivable from these polymorphisms,
only five are commonly seen: A136R154R171 (hereafter ARR), ARQ, VRQ, AHQ,
and ARH (Belt et al., 1995). It should also be noted that seven additional
ovine PRNP polymorphisms, exhibiting little or no association with the
phenotypic expression of scrapie, have been described at codons 112, 127,
137, 138, 141, 151, and 211 (as referenced in Bossers et al., 2000). High
susceptibility to scrapie is associated with the ovine VRQ allele, while
the ARR allele is associated with resistance (Westaway et al., 1994; Belt
et al., 1995; Hunter et al., 1996; Sabuncu et al., 2003). The AHQ allele
may be associated with resistance in some ovine breeds, but not others,
while the ARH allele is likely to be neutral (Dawson et al., 1998; Baylis
et al., 2002a).
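
The count of twelve possible alleles follows from combining the variants at the three codons (2 x 2 x 3). A short sketch that enumerates them and separates out the five commonly seen alleles named above is given here; the variable names are illustrative.

```python
from itertools import product

# Amino-acid variants at each polymorphic codon, as listed in the text.
CODON_136 = "AV"    # Alanine or Valine
CODON_154 = "HR"    # Histidine or Arginine
CODON_171 = "QRH"   # Glutamine, Arginine or Histidine

all_alleles = ["".join(combo) for combo in product(CODON_136, CODON_154, CODON_171)]
common_alleles = {"ARR", "ARQ", "VRQ", "AHQ", "ARH"}
rarely_seen = sorted(set(all_alleles) - common_alleles)

print(len(all_alleles), "possible alleles")
print("not commonly seen:", ", ".join(rarely_seen))
```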

In the U.S. sheep population scrapie has only been confirmed in sheep
homozygous for the PRNP allele encoding glutamine at codon 171 (Q/Q),
regardless of breed (Westaway et al., 1994; O'Rourke et al., 1996, 1997,
2002). Moreover, the ovine PRNP genotype ARR/ARR is known to confer global
resistance to scrapie and experimental BSE (for review of genotypes see
Belt et al., 1995; Hunter, 1997; Baylis et al., 2002a, 2002b). The ARR/AHQ
and ARR/ARQ genotypes are associated with nearly complete resistance to
scrapie worldwide as well as incubation periods of >5 years following
intracerebral challenge (IC) with BSE (Foster et al., 2001; Jeffrey et al.,
2001; Baylis et al., 2002a, 2002b). The ARQ/ARQ genotype is generally
associated with increased risk of scrapie worldwide although some breeds
(e.g. Cheviot Sheep, UK) are relatively resistant (Baylis et al., 2002b).
Sheep possessing the ARQ/VRQ genotype are at high risk of scrapie and
experimental BSE (Baylis et al., 2002b). The ARR/VRQ genotype, somewhat
variable by breed, is generally associated with rare to slightly elevated
risk of scrapie as well as incubation periods of >5 years following IC
with BSE (Belt et al., 1995; Foster et al., 2001; Jeffrey et al., 2001;
Baylis et al., 2002a, 2002b).
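
The genotype classes described above can be applied to observed genotype counts. The sketch below uses the SCW and CMP counts reported in Table 1 of this study and treats ARR/ARR as resistant and ARR/AHQ and ARR/ARQ as nearly resistant. Leaving the novel P116ARQ genotypes unclassified is an assumption made here; it reproduces the percentages quoted in the abstract.

```python
RESISTANT = {"ARR/ARR"}
NEARLY_RESISTANT = {"ARR/AHQ", "ARR/ARQ"}

# Observed genotype counts from Table 1 of this study.
scw_counts = {"ARR/ARR": 3, "P116ARQ/ARR": 1, "ARR/AHQ": 1, "ARR/VRQ": 1}
cmp_counts = {"ARR/ARR": 4, "ARR/ARQ": 12, "P116ARQ/ARQ": 4, "ARQ/ARQ": 3,
              "ARR/AHQ": 1, "ARR/VRQ": 1, "ARQ/VRQ": 1}

def pct_resistant(counts):
    """Percent of animals with resistant or nearly resistant genotypes."""
    favourable = RESISTANT | NEARLY_RESISTANT
    total = sum(counts.values())
    hits = sum(n for genotype, n in counts.items() if genotype in favourable)
    return round(100 * hits / total, 1)

print("SCW:", pct_resistant(scw_counts))
print("CMP:", pct_resistant(cmp_counts))
```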

The St. Croix White (SCW) and White Dorper (WD) breeds are members of a
larger group of sheep commonly referred to as hair sheep or hair breeds.
Collectively, hair breeds make up a relatively small portion of the overall
world sheep population and as a result have often escaped studies
pertaining to scrapie or the PRNP locus in general, yet they are the
predominant breed type found throughout the Caribbean and other tropical
regions (Shelton, 1991; Godfrey and Collins, 1999). Additionally, hair
breeds are commonly utilized in tropical regions worldwide for meat
production and are valued for their resistance to Trichostrongyle (Mazzola,
1990; Godfrey and Collins, 1999). In this study we investigated exon three
PRNP genotypes and allelic variants for SCW as well as a related composite
breed (CMP) developed for commercial meat production.

Materials and methods 

Study animals 

A total sampling of 33 sheep from Dorpcroix Sheep Farm (Hermleigh,
TX, USA) consisted of the following: six unrelated adult SCW (ewes)
previously utilized as breeding stock for CMP, one full-blooded adult WD
(ram) utilized as breeding stock for CMP, and 26 CMP (20 adult ewes, three
ewe-lambs, and three adult rams). Composite animals (26 sampled of n = 500
total on the farm) were developed for commercial meat production in 1998
and represent a synthetic breed resulting from an initial cross (SCW ewes x
WD rams) followed by selection and crossing of animals exhibiting
economically important traits such as overall hardiness and robust body
stature. The WD and SCW sampled do not represent the sole founding stock
for CMP. Study animals had no previous history or symptoms of scrapie at
the time of publication.

DNA isolation and PRNP amplification 

Genomic DNA was isolated from whole blood samples either by spotting
whole blood on Whatman Bioscience FTA® Classic Cards and following the
preparation protocol provided by the manufacturer (Whatman Inc., Clifton,
NJ), or through utilization of the SUPER QUICK-GENE DNA Isolation kit
(Analytical Genetic Testing Center, Denver, CO).

The entire coding region for exon three of the ovine PRNP gene was 
amplified via PCR with the flanking synthetic oligonucleotides SAFI and 
SAF2 (Prusiner et al., 1993). Thermal cycling parameters, as optimized in 
our laboratory, were as follows: 2 min at 96 ° C; 4 cycles * 30 s at 96 0 C, 30 s 
at 58 0 C (-1 °C/cycle), 90 s at 65 0 C; 31 cycles x 30 s at 96 ° C, 30 s at 54 °C, 
90 s at 65° C; 15 min at 65 °C. Each 25-pl reaction included a 1.2-mm FTA 
punch or 100 ng genomic DNA, 400 pM dNTPs, 2.0 mM MgCE, 0.28 pM 
each primer, lx reaction buffer, lx MasterAmp™ PCR Enhancer (Epi¬ 
centre, Madison, WI) and 1.0 unit Taq polymerase (Promega). PCR products 
were examined through agarose gel electrophoresis and purified using a Qia- 
gen QIAquick PCR Purification Kit (Qiagen, Valencia, CA). 
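The touchdown-style cycling profile described above can be written out step by step. The following is only an illustrative sketch, not the authors' thermocycler program: the function name `touchdown_schedule` is hypothetical, and it assumes that the first of the four touchdown cycles anneals at 58 °C and that each later touchdown cycle anneals 1 °C lower.

```python
def touchdown_schedule():
    """Expand the cycling profile into a flat list of (step, seconds, temp_C).

    Assumption: the "-1 C/cycle" decrement starts after the first
    touchdown cycle, giving annealing at 58, 57, 56 and 55 C.
    """
    steps = [("initial denaturation", 120, 96.0)]

    # Touchdown phase: 4 cycles with a falling annealing temperature.
    for cycle in range(4):
        steps.append(("denature", 30, 96.0))
        steps.append(("anneal", 30, 58.0 - cycle))
        steps.append(("extend", 90, 65.0))

    # Main phase: 31 cycles at a fixed 54 C annealing temperature.
    for _ in range(31):
        steps.append(("denature", 30, 96.0))
        steps.append(("anneal", 30, 54.0))
        steps.append(("extend", 90, 65.0))

    steps.append(("final extension", 15 * 60, 65.0))
    return steps


schedule = touchdown_schedule()
anneal_temps = [temp for name, _, temp in schedule if name == "anneal"]
hold_seconds = sum(seconds for _, seconds, _ in schedule)
print(anneal_temps[:5])          # first five annealing temperatures
print(hold_seconds / 60.0)       # total programmed hold time in minutes
```

The hold time printed here excludes ramping between temperatures, so the actual run time on an instrument would be longer.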

Sequencing 

Purified PCR products were directly sequenced using a Big Dye™ Ter¬ 
minator Cycle Sequencing kit (Applied Biosystems, Foster City, CA), the 
aforementioned PCR primers, and the following thermal parameters: 2 min 


Table 1. PRNP allele frequencies and observed genotype frequencies

Breed(a)  PRNP(b) allele   Total  Frequency   Genotype (obs.)   Total  Frequency (%)
WD        ARQ                  2     1.0000   ARQ/ARQ               1          100
WD Sum                         2     1.0000                         1          100
SCW       ARR                  9     0.7500   ARR/ARR               3         50.0
          ARQ                  0     0.0000   P116ARQ/ARR           1         16.7
          VRQ                  1     0.0833   ARR/AHQ               1         16.7
          AHQ                  1     0.0833   ARR/VRQ               1         16.7
          ARH                  0     0.0000
          P116ARQ(c)           1     0.0833
SCW Sum                       12     1.0000                         6          100
CMP       ARR                 22     0.4231   ARR/ARR               4         15.4
          ARQ                 23     0.4423   ARR/ARQ              12         46.2
          VRQ                  2     0.0385   P116ARQ/ARQ           4         15.4
          AHQ                  1     0.0192   ARQ/ARQ               3         11.5
          ARH                  0     0.0000   ARR/AHQ               1          3.8
          P116ARQ(c)           4     0.0769   ARR/VRQ               1          3.8
                                              ARQ/VRQ               1          3.8
CMP Sum                       52     1.0000                        26          100

(a) WD, White Dorper; SCW, St. Croix White; CMP, composite breed.
(b) Ovine PRNP exon 3.
(c) Alanine is the wild-type amino acid at ovine position 116.





at 96 °C; 35 cycles × 30 s at 96 °C, 20 s at 54 °C, 4 min at 60 °C; 5 min at
60 °C. Each 10-µl sequencing reaction included: 60 ng purified PCR product,
2 µl Big Dye™, 0.8 µM primer and 0.5× MasterAmp™ PCR Enhancer.
Reactions were purified with G-50 Sephadex columns (Biomax, Odenton,
MD). Sequence fragments were separated and analyzed using an ABI 3100
Genetic Analyzer (Applied Biosystems, Foster City, CA), and are available
through GenBank (accession nos. AY350241-AY350275).

Validation techniques 

Most samples were directly sequenced more than once. Representative
alleles from each genotypic class with more than one single nucleotide
polymorphism (SNP) were validated through cloning using a TOPO TA Cloning
kit according to the manufacturer's recommendations (Invitrogen, Carlsbad,
CA). Plasmid DNA was isolated and purified using a Qiagen Plasmid Mini
Kit as directed by the manufacturer (Qiagen Inc., Valencia, CA). Insert
sequencing for 12 clones was carried out via the sequencing method
previously described with the following exceptions: 400 ng/reaction plasmid
DNA, 50 °C anneal temperature, and M13 forward and reverse primers
(6.2 pmol/reaction).

Computer software and analysis 

Ovine PRNP exon three genotypes and allelic variants were visualized
using ABI PRISM SeqScape SNP Discovery and Validation Software
version 1.01 (Applied Biosystems, Foster City, CA). Allele frequencies and tests
of genic differentiation were calculated in GENEPOP (Raymond and
Rousset, 1995).
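Allele frequencies of the kind reported in Table 1 follow from simple allele counting over the observed genotypes. The sketch below is a hedged illustration of that arithmetic in plain Python; it is not GENEPOP and does not reproduce the genic differentiation test. The function name `allele_frequencies` and the "A1/A2" genotype strings are assumptions made for the example, and "PARQ" stands for the P116 A136 R154 Q171 allele.

```python
from collections import Counter
from itertools import combinations_with_replacement


def allele_frequencies(genotype_counts):
    """Map {'A1/A2': number of animals} to {allele: frequency}."""
    alleles = Counter()
    for genotype, n_animals in genotype_counts.items():
        # Each animal contributes both alleles; a homozygote adds two copies.
        for allele in genotype.split("/"):
            alleles[allele] += n_animals
    total = sum(alleles.values())
    return {allele: count / total for allele, count in alleles.items()}


# Observed SCW genotypes from Table 1 (six animals, twelve alleles).
scw = {"ARR/ARR": 3, "PARQ/ARR": 1, "ARR/AHQ": 1, "ARR/VRQ": 1}
scw_freqs = allele_frequencies(scw)
print({allele: round(freq, 4) for allele, freq in scw_freqs.items()})

# Number of distinct unordered genotypes for n alleles is n(n + 1)/2.
common_alleles = ["ARR", "ARQ", "VRQ", "AHQ", "ARH"]
possible = list(combinations_with_replacement(common_alleles, 2))
print(len(possible))
```

With the SCW counts this gives 9/12 for ARR and 1/12 each for PARQ, AHQ and VRQ, and five alleles yield 15 possible genotypes.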


Results and discussion 

The frequencies of the five most common ovine PRNP exon
three alleles (ARR, ARQ, VRQ, AHQ, and ARH), as verified
via cloning, as well as a new allele (P116A136R154Q171; hereafter
PARQ) for SCW and CMP are presented in Table 1. The novel
(P116) polymorphism associated with the PARQ allele is the
result of an SNP (G→C) at ovine nucleotide position 346. The
origin of the PARQ allele is likely SCW since the allele is
present in SCW and CMP, but not in the WD sampled (Table 1).
We did not detect the ARH allele in our SCW, WD, and
CMP samples (Table 1). Absence of the ARH allele has
previously been reported in Scottish Blackface, Welsh Mountain,
Swaledale, and Beulah breeds in the UK (Arnold et al., 2002).
Of the 15 possible genotypes derivable from the five most
common ovine PRNP exon three alleles we determined our samples
for WD, SCW, and CMP to possess only six, collectively
(Table 1). However, two new PRNP genotypes (PARQ/ARR and
PARQ/ARQ) were detected for the SCW and CMP samples,
thereby increasing the total number of distinct PRNP
genotypes detected in this study to eight (Table 1). The distribution
of PRNP exon three genotypes within the SCW, CMP, and WD
sampled is depicted in Table 1. Absence of the ARQ allele
combined with the relatively high frequency of the ARR allele
in the SCW sampled results in significant (P = 0.0123; S.E. =
0.00113) genic differentiation between SCW and CMP.

The relationship between scrapie susceptibility or
resistance, the novel PARQ allele, and/or the associated genotypes
(PARQ/ARR; PARQ/ARQ), is presently unknown. However,
the proline polymorphism noted at ovine amino acid position
116 occurs between the N-terminal cleavage site (between
Lysn 2 and Hisin; human numbering) and the hydrophobic
region of the prion protein, a region exhibiting extreme
conservation across mammalian groups (Fig. 1). The functional
ability of the normal cellular prion protein (PrPC) as a potential
cell-surface receptor is most likely modulated by the proteolytic
cleavage and removal of the N-terminal region of the protein
(Harris et al., 1993; van Rheede et al., 2003). Furthermore, the
amino acid residues immediately flanking the ovine (P116)
polymorphism are considered to play a major role in the interface
between PrPC and the pathogenic isoform (PrPSc) (Cohen and
Prusiner, 1998). Currently, three pathogenic human mutations
causing GSS (Gerstmann-Sträussler-Scheinker syndrome;
P102L; P105L; and A117V) and one human mutation strongly
associated with the phenotypic expression of vCJD (variant
Creutzfeldt-Jakob disease, M129V) have been described within
the regions of the prion protein immediately flanking the ovine
(P116) polymorphism (Fig. 1; for review see Collinge, 2001 and
van Rheede et al., 2003).

Fig. 1. Extreme conservation associated with the mammalian prion
protein regions flanking the hydrophobic region, the hydrophobic region itself,
AA position 116, and the N-terminal cleavage site. Proximally relevant
human mutations also depicted (GSS mutations: P102L, P105L, and
A117V; M129V associated with vCJD). Asterisk (*) indicates sequences
generated by van Rheede et al. (2003). Sequences obtained from GenBank, as
referenced and utilized in van Rheede et al. (2003), are indicated by a cross
(†). Taxa shown: Tree shrew*, Mole*, Fruit bat*, Leaf-nosed bat*,
Pangolin*, Horse*, Black rhino*, Sheep (wt)†, Sheep (SCW), Sheep (CMP),
Asian elephant*.

In conclusion, we have demonstrated that resistant or nearly
resistant genotypes exist for the majority of the CMP (65.4%)
and SCW (66.7%) sampled, while the WD ram was determined
to possess a susceptible genotype (ARQ/ARQ) (Table 1).
Additionally, the identification of a novel ovine PrP polymorphism
provides an opportunity for future challenge experiments to
investigate the potential effect(s) of the PARQ allele as well as
the PARQ/ARR and PARQ/ARQ genotypes.

Abstract. 1,144 sheep belonging to 21 breeds and known
crosses were sequence analyzed for polymorphisms in the ovine
PRNP gene. Genotype and allele frequencies of polymorphisms
in PRNP known to confer resistance to scrapie, a fatal
neurodegenerative disease of sheep, are reported. Known
polymorphisms at codons 136 (A/V), 154 (H/R) and 171 (Q/R/H/K)
were identified. The frequency of the 171R allele known to
confer resistance to type C scrapie was 53.8% and the
frequency of the 136A allele known to influence the resistance to type A
scrapie was 96.01%. In addition, we report the identification of
five new polymorphisms at codons 143 (H/R), 167 (R/S), 180
(H/Y), 195 (T/S) and 196 (T/S). We also report the
identification of a novel allele (S/R) at codon 138.

Copyright©2003 S. Karger AG, Basel 


Scrapie, a fatal neurodegenerative disease of sheep and
goats, is a member of the mammalian transmissible spongiform
encephalopathy (TSE) disease family. Other members of this
family include Creutzfeldt-Jakob disease (CJD) in humans,
bovine spongiform encephalopathy (BSE), transmissible mink
encephalopathy (TME), chronic wasting disease (CWD) in deer
and elk, and feline spongiform encephalopathy (FSE) (Prusiner,
1998). The causative agent for scrapie is believed to be a
protease-resistant isoform of sheep prion (PrPSc), which is derived
from an endogenous, protease-sensitive precursor (PrPC)
(Prusiner, 1982, 1996). Polymorphisms within the prion gene
(PRNP) are associated with susceptibility/resistance to TSEs in
sheep (Goldmann et al., 1990; Hunter et al., 1997, 2000), goats
(Goldmann et al., 1996) and in humans (Palmer et al., 1991;
Mead et al., 2003). Ten polymorphisms in the sheep PRNP
gene have been reported to occur in codons 112, 136, 127, 137,
138, 141, 151, 154, 171 and 211 (Goldmann et al., 1990;
Laplanche et al., 1993; Belt et al., 1995; Hunter et al., 1996;
Ishiguro et al., 1998; Bossers et al., 1999; Gombojav et al.,
2003). Of these, only three codons (codons 136, 154 and 171)
have a reported influence on susceptibility to the disease.
Codon 171 influences the susceptibility to scrapie in Suffolk
sheep under natural conditions and in Cheviot sheep when
experimentally induced (Westaway et al., 1994; Goldmann et
al., 1994a). In some breeds, scrapie susceptibility was
significantly enhanced by an alanine (A) being substituted by a valine
(V) at codon 136 (Hunter et al., 1994). Although the
influence of codon 154 on scrapie is unclear, there is evidence
suggesting that a histidine substitution at codon 154 enhances
the resistance to scrapie in some sheep breeds (Elsen et al.,
1999; Thorgeirsdottir et al., 1999).

In the Suffolk breed, valine (V) at codon 136 is reported to
be extremely rare or absent (Clouscard et al., 1995), and
resistance to scrapie infection is thought to be influenced primarily
by polymorphisms in codon 171 (Westaway et al., 1994;
O'Rourke et al., 1997). Animals with an allele carrying arginine
(R) at codon 171 usually are resistant to infection. However,
this association has not been absolute. Animals with a 171QR
genotype have been shown positive for scrapie in Scotland
(Hunter et al., 1997), one 171RR animal was diagnosed with
scrapie in Japan (Ikeda et al., 1995) and more recently, a
171QR Suffolk was diagnosed with scrapie in the US (Walsh,
2002).

Susceptibility to ovine scrapie also is determined by the
infective scrapie strain (Goldmann et al., 1994a; O'Rourke et
al., 1997). Two strains of scrapie have been defined based on
their infectivity in Cheviot sheep of distinct PRNP genotypes.
A challenge with isolate SSBP/1, the prototype for scrapie
strain A, produces disease in Cheviot sheep that are either
homozygous or heterozygous for valine at codon 136
(Goldmann et al., 1994b). In contrast, strain C causes disease in
sheep that are homozygous for glutamine at codon 171
(Goldmann et al., 1994a).

Though recognized as a disease of sheep in Europe more
than 250 years ago (Dawson et al., 1998), scrapie in the United
States was first diagnosed in 1947 (Kahler, 2002). In the United
States, the Suffolk breed accounts for 86.4% of all scrapie cases
(Westaway et al., 1994). A variety of breeds account for the rest
(http://www.animalagriculture.org/scrapie/Media/FactSheet.htm).
From the time it was first diagnosed in 1947 to August
2001, approximately 1,600 sheep and seven goats were
diagnosed with scrapie in the United States, and the annual loss
to the US sheep industry is estimated at US $20 million
(Kahler, 2002).

We have sequence analyzed a fragment of the PRNP gene 
from approximately 1,200 sheep belonging to 21 breeds and 
known crosses from flocks in the state of Oklahoma. This 
report describes the allelic variants, allele and genotype fre¬ 
quencies and breed distribution of allelic variants at codons 
136, 154 and 171. We also report the identification of less prev¬ 
alent, but previously reported polymorphisms at codons 137, 
138, and 141 as well as previously unreported polymorphisms 
at codons 143, 167, 180, 195 and 196. As far as we are aware, 
this is the largest survey of PRNP polymorphisms reported. 

Materials and methods 

Animals and sample collection 

1,144 sheep from flocks across Oklahoma were included in this study. 
Owners of these animals participated in a voluntary scrapie certification pro¬ 
gram. Owners were instructed to collect a few drops of blood in an Isocode 
Stix DNA isolation matrix (Schleicher and Schuell, Dassel, Germany), dry at 
room temperature, and mail to the laboratory. Samples were collected during 
2002-2003. 

DNA extraction and PCR amplification 

A single triangular area of the Isocode DNA isolation matrix was
separated into a 96-well PCR plate using a sterile pair of forceps. Samples were
washed in 150 µl of dH2O by vortexing three times for 15 s each. The wash
water was discarded by aspiration and 100 µl of fresh dH2O added to the
tubes. DNA was eluted by heating the samples to 95 °C for 30 min in a PCR
machine and centrifuging at 3,000 rpm for 30 s. 40 µl of the resultant isolate
was used for further analysis.

PCR primers that amplify a 421-bp fragment of the ovine PRNP gene 
(GenBank accession no. X79912) were developed using Primer3 software 
(Rozen and Skaletsky, 2000). Primer sequences were: 

PRNPF: 5'-CAA GCC CAG TAA GCC AAA AA-3'

PRNPR: 5'-CAC AGG AGG GGA AGA AAA GAG-3'
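Basic properties of the two primers can be checked directly from their sequences. The sketch below is an illustration only: the helper `primer_stats` is hypothetical, and the melting temperature uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), a rough rule of thumb for short oligonucleotides. It is not the thermodynamic model used by Primer3 and is not claimed to be how the authors chose the 60 °C annealing step.

```python
def primer_stats(sequence):
    """Return length, GC content (%) and Wallace-rule Tm for a primer."""
    seq = sequence.replace(" ", "").upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return {
        "length": len(seq),
        "gc_percent": 100.0 * gc / len(seq),
        # Wallace rule: a crude estimate, assumed here for illustration.
        "wallace_tm_c": 2 * at + 4 * gc,
    }


forward = primer_stats("CAA GCC CAG TAA GCC AAA AA")      # PRNPF
reverse = primer_stats("CAC AGG AGG GGA AGA AAA GAG")     # PRNPR
print(forward)
print(reverse)
```

Both estimates land close to the 60 °C annealing temperature given in the cycling conditions, which is all this rough check is meant to show.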

PCR reactions were performed in an MJ Dyad thermal cycler (MJ
Research Inc, Waltham, MA, USA). The 50-µl reaction mix consisted of 39.4 µl
DNA, 1× PCR buffer (Applied Biosystems, Foster City, CA, USA), 1.5 mM
MgCl2, 25 pmol of each primer, 200 µM dNTPs, and 3 U Taq DNA
Polymerase (Promega, Madison, WI, USA). Cycling conditions were one cycle of
3 min at 95 °C, 30 s at 60 °C, 40 s at 72 °C, followed by 29 additional cycles
of 30 s at 94 °C, 30 s at 60 °C, 40 s at 72 °C and a final extension of 3 min at
72 °C.

PCR cleanup and sequencing 

PCR products in a 96-well format were treated with shrimp alkaline
phosphatase (1 unit/µl) and Exonuclease III (10 units/µl) (Amersham
Biosciences, Piscataway, NJ, USA) using a volume equaling 5% of the PCR
product, incubated at 37 °C for 30 min followed by 80 °C for 10 min. 1-5 µl
of the resulting PCR product was used in sequencing reactions without
further cleanup. A single nested primer, 5'-CCA GTA AGC CAA AAA CCA
ACA-3', was used in all sequencing reactions. PCR products from four
96-well plates were arrayed in a 384-well format (Cycleplate 384, Robbins
Scientific) and sequencing was performed in an 8-µl reaction volume containing
1-5 µl PCR product, 6.5 pmol sequencing primer, 0.5 µl ET dye-terminator
mix (Amersham) and 1 µl buffer (400 mM Tris-HCl, pH 9.0, 10 mM MgCl2).
Sequencing reactions were performed in a Perkin-Elmer 9700 thermocycler
(Applied Biosystems, Foster City, CA, USA) using the following conditions:
60 cycles at 95 °C for 30 s, 50 °C for 20 s and 60 °C for 4 min, followed by a
4 °C hold. Sequence reactions were ethanol precipitated, dissolved in water
and analyzed on an ABI 3700 DNA analyzer (Applied Biosystems). Data
were collected using the ABI software and transferred to a SUN workstation
for further analysis. All sequences were base-called and analyzed using the
Phred/Phrap/Consed suite of programs (Ewing et al., 1998; Ewing and Green,
1998; Gordon et al., 1998). PolyPhred (version 4) (Nickerson et al., 1997)
was used to identify single nucleotide polymorphisms (SNPs) in the
assembled sequences. The PolyPhred output was reformatted and arrayed using an
in-house script, report_polyphred.pl
(http://www.genome.ou.edu/informatics.html).

All data were manually checked for accuracy at codons where
polymorphisms were noted. Any ambiguous codon identifications were discarded
from the final analysis. In sequence chromatograms, data towards the start of
the read are less distinct than those towards the middle. This accounts for
the discrepancy between the allele numbers reported for codon 136 and codon 171. As
this seems to be an entirely random process, validity of data was not
compromised.


Results and discussion 

The initial objective of this study was to develop a robust, 
cost-effective and scalable protocol to identify the codon 171 
polymorphisms of sheep in Oklahoma, as part of a nationwide 
scrapie eradication program. PRNP allelic variants for 1,144 
sheep are reported here and represent one of the largest surveys 
of PRNP genotypes reported. The oldest animal reported was 6 
years old and there were several newborn lambs. All samples 
were collected during 2002-2003. Male to female ratio was 
1:15.5. These animals are representative of the average breed 
distribution of sheep found in Oklahoma. Suffolk with 232 ani¬ 
mals was the most numerous breed represented followed by 
Hampshire (n = 106), Dorset (n = 88) and Montadale (n = 54). 
There were a total of 21 breeds and known crosses with most of 
them representing less than ten animals per breed. They were 
pooled as meat type crosses, wool type crosses and hair breeds 
for ease of analysis. Meat type crosses included Oxford, Suffolk, 
Dorset, Montadale and Hampshire crosses. Wool type crosses 
were Romney, Merino, Rambouillet and their crosses. Hair 
types were St. Croix, Dorper, Katahdin, Barbados and their 
crosses. 

Codon 171 

We observed four alleles and six different genotypes at 
codon 171 (Table 1). Q, R and H alleles are well known. We 
recently reported the identification of the novel lysine (K) allele 

Table 1. PRNP genotypes and allele frequencies at codon 171 in Oklahoma sheep

                         --------------------------- Codon 171 genotype ---------------------------   ---- Allele frequency ----
Breed               Total    QQ      %    QR      %    RR      %    KQ      %    KR      %    HQ      %     Q%     R%     K%     H%
Suffolk               232    44  18.97   102  43.97    86  37.07         0.00                            40.95  59.05   0.00   0.00
Hampshire             106    27  25.47    56  52.83    23  21.70         0.00                            51.89  48.11   0.00   0.00
Dorset                 88    40  45.45    41  46.59     7   7.95         0.00                            68.75  31.25   0.00   0.00
Montadale              54    22  40.74    24  44.44     8  14.81         0.00                            62.96  37.04   0.00   0.00
Meat type crosses(a)  170    32  18.82    86  50.59    49  28.82     3   1.76                            45.00  54.12   0.88   0.00
Wool type crosses(a)   26     4  15.38    12  46.15    10  38.46         0.00                            38.46  61.54   0.00   0.00
Hair breeds(a)         77    26  33.77    37  48.05     8  10.39     4   5.19     1            1         61.04  35.06   3.25   0.65
Unknown               391    44  11.25   204  52.17   143  36.57         0.00                            37.34  62.66   0.00   0.00
Total                1144   239  20.89   562  49.13   334  29.20     7   0.61     1            1   0.09  45.80  53.80   0.35   0.04

(a) See text for breeds pooled in these categories.



Table 2. PRNP genotypes and allele frequencies at codon 136 in Oklahoma sheep

                         ------------ Codon 136 genotype ------------   Allele frequency
Breed               Total    AA       %    AV       %    VV       %      A%      V%
Suffolk               132   130   98.48     2    1.52            0.00   99.24    0.76
Hampshire              92    92  100.00            0.00            0.00  100.00    0.00
Dorset                 63    55   87.30     6    9.52     2    3.17   92.06    7.94
Montadale              47    28   59.57    17   36.17     2    4.26   77.66   22.34
Meat type crosses(a)  115   114   99.13     1    0.87            0.00   99.57    0.43
Wool type crosses(a)   16    16  100.00            0.00            0.00  100.00    0.00
Hair breeds(a)         67    67  100.00            0.00            0.00  100.00    0.00
Unknown               233   216   92.70     7    3.00    10    4.29   94.21    5.79
Total                 765   718   93.86    33    4.31    14    1.83   96.01    3.99

(a) See text for breeds pooled in these categories.


in codon 171 in eight animals (Guo et al., 2003). QQ, QR and
RR genotype frequencies were 20.89, 49.13, and 29.20%,
respectively. According to these data, 79.11% of Oklahoma sheep
are resistant to infection by type C scrapie. Allele frequencies
for Q, R, H, and K were 45.8, 53.8, 0.04, and 0.35%,
respectively. Of the studied major breeds, the R allele frequency was
highest in Suffolk with 59.05% and lowest in Dorset at 31.25%.
Intriguingly, out of the 2,288 alleles analyzed, we only observed
one histidine (H) containing allele, in a St. Croix-Dorper ewe.
The lysine-171 allele was seen in eight animals, with seven
having a KQ genotype and one KR. Five of these animals were hair
breeds (Dorper = 1, Barbados Blackbelly = 2 and Barbados-St.
Croix = 2). The K allele frequency in both Barbados Blackbelly and
Barbados-St. Croix was 33.3%, but this should be
considered with caution as we only tested three animals in each of
these two breeds (Guo et al., 2003). We also observed three
animals with KQ-171 genotype in Suffolk crosses. Pedigree
records for several generations were checked in that flock for
evidence of Barbados, St. Croix, or any other hair sheep influx
into the flock, but this proved negative. Recently, a single Kalkh
sheep from Mongolia was reported to carry a lysine allele at
codon 171 (Gombojav et al., 2003). The effect of a lysine amino
acid at codon 171 in resistance to type C scrapie remains
unknown.
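The codon 171 allele frequencies quoted above can be recovered by allele counting from the genotype totals in Table 1 (239 QQ, 562 QR, 334 RR, 7 KQ, 1 KR and 1 HQ among 1,144 animals, i.e. 2,288 alleles). The worked arithmetic below is offered as a consistency check rather than as the authors' own calculation; the symbols f_Q, f_R, f_K and f_H are notation introduced here.

```latex
\begin{align*}
f_Q &= \frac{2(239) + 562 + 7 + 1}{2(1144)} = \frac{1048}{2288} \approx 0.4580 \\
f_R &= \frac{2(334) + 562 + 1}{2288} = \frac{1231}{2288} \approx 0.5380 \\
f_K &= \frac{7 + 1}{2288} = \frac{8}{2288} \approx 0.0035 \\
f_H &= \frac{1}{2288} \approx 0.0004
\end{align*}
```

The four allele counts sum to 2,288, and the results agree with the 45.8, 53.8, 0.35 and 0.04% reported in the text.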


The R allele frequency and the desirable QR and RR genotype
frequencies observed in this study are much higher than those
reported previously for U.S. sheep, particularly the Suffolk
breed (Westaway et al., 1994; O'Rourke et al., 1996; Stephens
et al., 1998). This likely is due to the aggressive selection for the
R-171 allele among Oklahoma sheep breeders over the last five
years. At the present time, only 20.89% of the animals studied
seem to be at risk for type C scrapie.

Codon 136 

We observed the known polymorphisms of alanine (A) and
valine (V) at codon 136. Codon 136 data were analyzed using
information from 765 animals. AA, AV and VV genotype
frequencies across all animals tested were 93.86, 4.31 and 1.83%,
respectively (Table 2). Frequency of the alanine allele was
96.01% and that of valine was 3.99%. These data are consistent
with the observation that V-136 is rare in U.S. sheep
populations (O'Rourke et al., 1996; Stephens et al., 1998). Our data
are also consistent with the observation that V-136 is extremely
rare in Suffolk sheep (Clouscard et al., 1995). Out of 264
Suffolk alleles tested, we identified two V-136 alleles (0.76%).
Surprisingly, V-136 allele frequency was markedly higher in
Montadale sheep (22.34%). However, this observation could
be biased as most Montadale sheep in the current study (47)

Table 3. PRNP genotypes and allele frequencies at codon 154 in Oklahoma sheep

                         ------------ Codon 154 genotype ------------   Allele frequency
Breed               Total    RR       %    RH       %    HH       %      R%      H%
Suffolk               233   232   99.57     1    0.43            0.00   99.79    0.21
Hampshire             105   105  100.00            0.00            0.00  100.00    0.00
Dorset                 92    90   97.83     2    2.17            0.00   98.91    1.09
Montadale              54    54  100.00            0.00            0.00  100.00    0.00
Meat type crosses(a)  168   168  100.00            0.00            0.00  100.00    0.00
Wool type crosses(a)   21    20   95.24     1    4.76            0.00   97.62    2.38
Hair breeds(a)         76    75   98.68     1    1.32            0.00   99.34    0.66
Unknown               389   383   98.46     5    1.29     1    0.26   99.10    0.90
Total                1138  1127   99.03    10    0.88     1    0.09   99.47    0.53

(a) See text for breeds pooled in these categories.


Table 4. PRNP genotype distributions for codon 136, 154, 171

             Wool type     Meat type          Hair       Suffolk     Montadale     Hampshire        Dorset       Unknown
            crosses(a)    crosses(a)     breeds(a)
Genotype Number      % Number      % Number      % Number      % Number      % Number      % Number      % Number      %
VVHRQQ            0.00          0.00          0.00          0.00          0.00          0.00      1   1.61      1   0.43
VVRRQQ            0.00          0.00          0.00          0.00      2   4.26          0.00      1   1.61      6   2.58
VVRRQR            0.00          0.00          0.00          0.00          0.00          0.00          0.00      3   1.29
AVRRQQ            0.00          0.00          0.00      1   0.78      9  19.15          0.00      1   1.61      2   0.86
AVRRQR            0.00      1   0.88          0.00      1   0.78      8  17.02          0.00      5   8.06      5   2.15
AAHHQQ            0.00          0.00          0.00          0.00          0.00          0.00          0.00      1   0.43
AAHRQQ        1   6.25          0.00          0.00          0.00          0.00          0.00          0.00          0.00
AAHRQR            0.00          0.00      1   1.59      1   0.78          0.00          0.00          0.00          0.00
AAHRRR        1   6.25          0.00          0.00          0.00          0.00          0.00          0.00          0.00
AARRKQ            0.00      3   2.63      4   6.35          0.00          0.00          0.00          0.00          0.00
AARRKR            0.00          0.00      1   1.59          0.00          0.00          0.00          0.00          0.00
AARRQH            0.00          0.00      1   1.59          0.00          0.00          0.00          0.00          0.00
AARRQQ        2  12.50     18  15.79     22  34.92     22  17.19      9  19.15     24  26.37     24  38.71     15   6.44
AARRQR        8  50.00     59  51.75     28  44.44     56  43.75     11  23.40     48  52.75     24  38.71    114  48.93
AARRRR        4  25.00     33  28.95      6   9.52     47  36.72      8  17.02     19  20.88      6   9.68     86  36.91
Total        16           114            63           128            47            91            62           233

(a) See text for breeds pooled in these categories.


came from one flock. The Dorset breed too had a higher fre¬ 
quency of the V-136 allele at 7.94%. 

Codon 154 

We observed known polymorphisms of arginine (R) and his¬ 
tidine (H) at codon 154 (Table 3). RR, RH and HH genotypes 
across all breeds were 99.03, 0.88, and 0.09% respectively. 
These data were consistent with observations in other studies. 

Combined codon 136, 154 and 171 polymorphisms are
shown in Table 4. As expected, the most common genotypes
across all breeds were AARRQR, AARRQQ and AARRRR.
We observed three crossbred animals with the very rare
genotype of VVRRQR. This indicates that all three animals had one
VRR allele with a valine at codon 136 and an arginine at codon
171. VRR alleles, although
previously identified in Germany (Kutzer et al., 2002), are
considered so rare that some commercial PrP genotyping
companies advise owners of sheep with an RR genotype at
codon 171 not to test for codon 136 polymorphisms. However,
the presence of this allele combination would be of significance
if type A scrapie is present in the United States as alleged
recently (Walsh, 2002).
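The inference that a VVRRQR animal must carry one VRR allele can be made explicit by enumerating the haplotype pairs compatible with an unphased three-codon genotype. The following is a minimal sketch under the assumption that each genotype is written as three two-letter codon calls (136, 154, 171); the function name `haplotype_pairs` is hypothetical and is not part of the authors' pipeline.

```python
from itertools import product


def haplotype_pairs(genotype):
    """List the haplotype pairs consistent with an unphased genotype.

    genotype: one two-letter string per codon, e.g. ["VV", "RR", "QR"].
    """
    pairs = set()
    # For each codon, choose which of its two alleles sits on the
    # first chromosome; the other allele goes on the second.
    for orientation in product((0, 1), repeat=len(genotype)):
        hap1 = "".join(call[o] for call, o in zip(genotype, orientation))
        hap2 = "".join(call[1 - o] for call, o in zip(genotype, orientation))
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


# Only one codon is heterozygous, so the phase is unambiguous.
vvrrqr = haplotype_pairs(["VV", "RR", "QR"])
print(vvrrqr)

# Two heterozygous codons leave two possible haplotype pairs.
avrrqr = haplotype_pairs(["AV", "RR", "QR"])
print(avrrqr)
```

For VVRRQR the only solution is VRQ with VRR, whereas AVRRQR could be ARQ/VRR or ARR/VRQ and so cannot by itself demonstrate a VRR allele.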

Other polymorphisms 

Eight additional, relatively uncommon, polymorphisms
were identified in this study (Table 5). Two of these
polymorphisms (codons 137 and 141) have been described previously.
We identified a novel allele at a third known polymorphic
codon (codon 138). Five remaining polymorphic codons have
not been described previously. A T→C nucleotide substitution
resulting in a methionine to threonine amino acid
substitution was detected at codon 137 in 34 animals. This substitution
has been reported previously (Bossers et al., 1997; Gombojav et
al., 2003). Affected animals ranged from Katahdin and Dorper,
to Suffolk and Hampshire. One Katahdin sheep was
homozygous for the T allele at codon 137. A known polymorphism was




Cytogenet Genome Res 102:89-94 (2003) 








Table 5. Secondary polymorphisms in PRNP gene identified in the study

                      ----- Number of animals -----
Codon  Substitution   Homozygotes   Heterozygotes   Status
137    M→T                      1              32   Previously known
138    S→R                                     15   New allele of a known polymorphic locus
141    L→F                                     23   Previously known
143    H→R                      6              48   New polymorphism
167    R→S                      1              14   New polymorphism
180    H→Y                                      2   New polymorphism
195    T→S                                      4   New polymorphism
196    T→S                                      2   New polymorphism


recorded at codon 141 (L→F) in 23 animals. We detected 15
animals with a novel serine (S) to arginine (R) substitution at
codon 138 (Dorset, Suffolk and crosses). A different
polymorphism at codon 138, with serine being substituted by
asparagine (S→N), was reported from both Norway and Iceland
(Tranulis et al., 1999; Thorgeirsdottir et al., 2002). 54 animals had a
previously unreported polymorphism (H→R) at codon 143.
Six animals were homozygous RR at this codon and the rest were
heterozygous. 23 of these animals were from a single
Montadale flock with an RR ram and the rest were Dorset, Suffolk and
crossbreds. Yet another novel polymorphism, an arginine to
serine substitution, was observed at codon 167 in 15 animals.

Finally, three other previously undescribed polymorphisms at
codon 180 (H→Y), 195 (T→S), and 196 (T→S) were also
identified. In total we identified five novel polymorphisms and
a new allele at yet another codon during the course of this
study. The effect of these polymorphisms in scrapie
pathogenesis remains unknown.
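The link between a single-nucleotide change and its amino acid substitution can be checked against the standard genetic code. The sketch below illustrates this for the codon 137 change reported above (T→C, methionine to threonine). Methionine has only one codon, ATG, so the wild-type codon is known; the codon sequences behind the other substitutions are not given in the text and are not guessed here. The names `CODON_TABLE` and `single_base_changes` are introduced for this example.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_base_changes(codon, old_base, new_base):
    """Return {mutant codon: amino acid} for each old_base -> new_base swap."""
    results = {}
    for position, base in enumerate(codon):
        if base == old_base:
            mutant = codon[:position] + new_base + codon[position + 1:]
            results[mutant] = CODON_TABLE[mutant]
    return results


wild_type = CODON_TABLE["ATG"]                       # methionine
mutants = single_base_changes("ATG", "T", "C")       # the T -> C change
print(wild_type, mutants)
```

The only T in ATG is the middle base, so the T→C change gives ACG, which encodes threonine, matching the M→T substitution listed in Table 5.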

From a sow's ear to a silk purse: real progress in
porcine genomics

Abstract. An incredible amount of progress has occurred in
the past decade since the pig genome map began to develop. 
The porcine genetic linkage map now has nearly 5,000 loci 
including several hundred genes, microsatellites and amplified 
fragment length polymorphism (AFLP) markers being added to 
the map. Thanks to somatic cell hybrid panels and then radia¬ 
tion hybrid panels the physical genetic map is also growing rap¬ 
idly and now has over 4,000 genes and markers. Many quanti¬ 
tative trait loci (QTL) scans have been completed and together 
with candidate gene analyses have identified important chro¬ 
mosomal regions and individual genes associated with traits of 
economic interests. Using marker assisted selection (MAS) the 


commercial pig industry is actively using this information and 
traditional performance information to improve pig produc¬ 
tion. Large scale pig arrays are just now beginning to be used 
and co-expression of thousands of genes is now advancing our 
understanding of gene function. The pig’s role in xenotrans¬ 
plantation and biomedical research makes the study of its 
genome important for the study of human disease. Sequencing 
of the pig genome appears on the near horizon. This commenta¬ 
ry will discuss recent advances in pig genomics, directions for 
future research and the implications to both the pig industry 
and human health. 

Copyright©2003 S. Karger AG, Basel 


The pig is one of the world's most important livestock
species and is represented by nearly 500 breeds worldwide. It was
likely one of the first animals domesticated over 7,000 years 
ago and pork is now the major red meat consumed (43%) 
worldwide (Rothschild and Ruvinsky, 1998). Furthermore, the 
pig has served as an important model system for human health 
and represents a significant source of tissues and heart valves 
and in the future perhaps organs for transplantation. Efforts to 
unravel the pig genome began in the early 1990s with the devel¬ 
opment of the international PiGMaP gene mapping project 
(Archibald et al., 1995) and efforts by the USDA and US agri¬ 
cultural universities (Rothschild et al., 1994; Rohrer et al., 
1996). A function of both the PiGMaP program and the USDA 
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Cooperative State Research Education and Extension 
(CSREES) program was that both were designed to provide a 
structure that included collaboration and cooperation. In the 
US a Pig Genome Coordinator position was developed and 
facilitated collaboration between scientists from state and pri¬ 
vate universities and federal labs that operated cooperatively in 
a Swine Genome Technical Committee, which has met yearly 
since 1994. The US Pig Genome Coordinator activities, in con¬ 
cert with activities of the USDA-ARS and international gene 
mapping projects, such as PiGMaP and others, have allowed 
the status of the pig gene map to evolve more quickly and devel¬ 
opments in functional genomics to advance rapidly in the last 
several years. 

Gene mapping 

Progress has slowed somewhat since the advances of the mid
1990s that produced the first maps, but new gene markers
consisting of microsatellites, amplified fragment length
polymorphisms (AFLPs) and single nucleotide polymorphisms (SNPs)
continue to be identified and mapped, and some integration of the
maps continues to take place as quantitative trait maps are
expanded (see


KARGER 


Fax+ 41 61 306 12 34 
E-mail karger@karger.ch 
www.karger.com 


© 2003 S. Karger AG, Basel 
0301-0171/03/1024-0095 $19.50/0 


Accessible online at: 
www.karger.com/cgr 




Table 1). The largest single map contains about 1,200 markers
(Rohrer et al., 1996) but no new large-scale maps have been
published recently. In total there are nearly 1,000 genes and
1,700 microsatellite markers in the database
(www.thearkdb.org/browser?species=pig). There is a developing
AFLP map with about 2,300 AFLPs that is likely to be added to the
PiGMaP linkage map some time in the future. Together these
markers now total over 5,000 and individual maps can be roughly
combined using markers found on more than one map. With the
development and use of chromosome painting (Goureau et al.,
1996), a somatic cell hybrid map (Yerle et al., 1996) and a
7,000 rad radiation hybrid (RH) panel (IMpRH) (Yerle et al.,
1998; Hawken et al., 1999), integration of the linkage,
cytogenetic and physical maps is well under way
(Lahbib-Mansais et al., 2003). This RH map is growing rapidly
and now contains nearly 3,000 markers, including microsatellites
and over 2,000 new expressed sequence tags (ESTs), many of which
are human orthologs that enable comparative mapping (Rink et
al., 2002b; Tuggle et al., 2003). Continued use of these
resources and development of an advanced 12,000 rad RH map are
under way (Yerle et al., 2002). In particular, this will build a
rapidly developing comparative map that builds on information
from the sequenced genomes (e.g. human and mouse) and will
accelerate the identification of the genes explaining variation
in traits of interest, whether those genes are identified by QTL
studies or through direct approaches such as candidate gene
association analyses.


Table 1. Porcine genome statistics

Chromosomes
  Number                 18 autosomes, X, Y
Genome size
  Predicted              3 Gb
Linkage map
  Genes                  ~1,000
  Microsatellites        ~1,700
  AFLPs                  ~2,300
Physical map
  Somatic cell hybrid    ~900 genes and markers
  Radiation hybrid       ~3,000 genes and markers
EST sequences
  In database            ~120,000
  In progress            ~700,000
cDNA libraries           >100

Data predicted from combination of published and
unpublished information.


Database activities

Informatics and databases provide the tools needed for
future discoveries. Significant pig bioinformatics efforts have
been initiated by the Roslin Institute, Scotland
(www.thearkdb.org) and to a lesser extent in the US
(www.genome.iastate.edu) to support the pig genome efforts and
display the gene maps (Archibald and Law, 2003). PiGBASE, which
can be reached through these sites, has several features
including pig gene mapping references, with over 1,093 citations
in the database, and gene maps with about 2,600 loci. Last year
there were over two million hits at these pig genome sites.
Additional web sites exist for the cytogenetic map of the pig
(http://www.toulouse.inra.fr/lgc/pig/cyto/cyto.htm) and the RH
panel map (http://www.toulouse.inra.fr/lgc/pig/RH/Menuchr.htm).
A comparative map is also on the web
(http://www.toulouse.inra.fr/lgc/pig/compare/compare.htm). In
addition, a new EST database (http://pigest.genome.iastate.edu)
has been developed and should become a similarly useful
resource. It is now accessible and contains nearly 100,000 pig
EST entries, and further development will continue. Other useful
gene tools are available from the US pig genome web site
(http://www.genome.iastate.edu).


Table 2. QTL and candidate genes discovered

                       Number of studies (a)   Number discovered (b)
Growth QTL             15                      >85
Fat QTL                16                      >70
Meat quality           10                      >100
Reproduction QTL       2                       >120
Health related QTL     3                       >10

Candidate genes        Traits
HAL                    meat quality/stress
KIT                    white color
MC1R                   red/black color
MC4R                   growth and fatness
RN, PRKAG3             meat quality
FABP3, FABP4           intramuscular fat
CAST                   tenderness
IGF2                   carcass composition
ESR, PRLR, RBP4        litter size
FSHB                   reproduction
FUT1, NRAMP1, SLA      disease susceptibility
Trade secret tests     several traits

(a) Best estimated from literature survey - see Bidanel and Rothschild (2002).
(b) Summary of results from Bidanel and Rothschild (2002) and other literature;
classifications for traits include: growth (birthweight, from birth to weaning,
weaning to market, growth at some ages), fat (backfat measured at several body and
carcass locations), meat quality (pH at several times, muscle characteristics,
reflectance, water holding, sensory traits), reproduction (number born alive, born
dead, number weaned, age at puberty, ovulation rate, embryo survival), health
(stress measures, response to vaccination, challenge to limited diseases).


QTL and candidate genes 

Since pork contributes 43% of the world’s red meat consumed,
its production requires efficient growth rates, reduced feed
intake, and increased carcass merit and meat quality.
Furthermore, there must be high levels of reproductive success
among breeding animals, and disease resistance and increased
survivability in young pigs. Once initial maps were available,
the US Pig Genome Coordinator supplied microsatellite primers to
over 40 labs worldwide. Researchers using these mapping
resources and commercial and exotic pig breeds were able to
identify quantitative trait loci (QTL) affecting many of these
traits (Table 2). A large number of QTL have been reported on
nearly all chromosomes for growth, carcass and meat quality
traits and on several chromosomes for reproduction (Bidanel and
Rothschild, 2002). The QTL affecting immune response traits
and disease resistance are far less numerous. This is an area
where gene expression approaches may be particularly valuable
in further unraveling the genetic causes of disease in pigs.
Following discoveries of imprinted genes in other species,
researchers have expanded their projects to look for imprinting
and parent-of-origin effects (De Koning et al., 2000). In
particular, one such region on chromosome 2 has been intensively
investigated (Georges et al., 2003) and IGF2 has been implicated
in causing a major effect on muscle mass. Georges and colleagues
cleverly employed a haplotype sharing strategy combined with
marker-assisted segregation analysis to position the QTL within
a 500-kb region. The causal quantitative trait nucleotide
(QTN) was revealed after investigating over 180 SNPs, and this
work clearly points to the need for careful analysis of all gene
regions and for the proper animals and phenotypic information.
Further evidence for imprinted regions and genes is likely to
be found now that these approaches have been developed.

Even before extensive QTL scans were completed, candidate
gene analyses had been employed (Rothschild and Soller,
1997) to investigate a variety of traits. To date, significant
associations have been demonstrated for candidate genes (see
Table 2) for litter size (ESR, PRLR, RBP4, FSHB), growth and
backfat (MC4R), meat quality (PRKAG3, CAST), disease
resistance (FUT1, SLA, NRAMP1), and coat color (KIT,
MC1R) (reviewed in Bidanel and Rothschild, 2002). The
commercial pig industry is actively using this gene marker
information in combination with traditional performance
information in marker-assisted selection programs to improve pig
production. As QTL regions become clearly identified, positional
candidate gene analyses are being employed to elucidate other
known QTL. In addition to the previously discussed IGF2
discovery, other causative mutations in positional candidates
have been discovered. Two such recent cases involving meat
quality are the discovery of QTN mutations in PRKAG3 that affect
pH and drip loss (Ciobanu et al., 2001) and in CAST that affect
tenderness (Ciobanu et al., 2002). It is likely that, as QTL
experiments are expanded and the comparative map improves,
additional positional candidates will be identified and the
causative QTN discovered.

Sequencing efforts 

Considerable analysis of the porcine genome (Table 1) has
provided strong evidence that it is similar to the human genome
in chromosomal organization (2n = 38, including meta- and
acrocentric chromosomes), size (3 x 10^9 bp), and complexity.
Recent efforts by many researchers have generated ESTs from
cDNA clones randomly picked from libraries from many tissues.
These EST projects have varied in size and in the tissues
used (e.g. Wintero et al., 1996; Davoli et al., 2002; Rink et al.,
2002a; Yao et al., 2002; Nobis and Coussens, 2003). The largest
of these projects published to date was sponsored by
the USDA and reported the sequencing and initial analysis of
66,245 ESTs (Fahrenkrug et al., 2002). In addition, 21,499
sequences from reproductive tissue were produced by a
consortium of several research groups (Tuggle et al., 2003). In
GenBank there are now approximately 120,000 sequences, and in
the October 2002 TIGR release there were 17,350 clusters and
31,000 singletons. More deposits of 5,000 to 10,000 EST
sequences are expected soon. Most importantly, however, a
major Sino-Danish effort to sequence the pig genome
(http://www.piggenome.dk/) has resulted in approximately 700,000
EST sequences that are expected to be deposited in the
database in the next several months. Sequencing of a large number
of these ESTs will continue to assist comparative mapping
efforts, candidate gene discovery and expression analysis.

The near completion of the human genome sequencing
efforts has allowed other sequencing efforts to be considered.
Following a request from the NIH, the research communities for a
number of species have submitted requests to be considered for
sequencing. A swine genome community effort produced a
“White Paper” (Rohrer et al., 2003) that was recently submitted
to NHGRI and outlined the role the pig plays in agriculture
as well as a model for human biology. The White Paper
received very solid backing from colleagues from several
countries and from industry personnel from many companies and
organizations. It received a “high priority status” but must
await sufficient offers to fund the project. A cooperative project
to develop a BAC map using the existing BAC library resources
with approximately 35X coverage (Rohrer et al., 2003) has
progressed in a timely manner.

Functional analysis 

To “make the sow’s ear into a silk purse” we clearly must
better understand the physiological complexity of the pig
transcriptome, and expression and/or functional gene analysis
needs to be undertaken. Researchers first explored a limited
number of genes with techniques such as Northern analysis and
differential display PCR (Wilson et al., 2002; Gladney et
al., 2004). More recently, other approaches have included
quantitative real-time PCR to determine mRNA levels for
immune response and disease infection levels (Dawson et al.,
2003; Okomo-Adhiambo et al., 2003). These techniques, while
quite useful, have proved to be limited in the numbers of genes
that can be considered. Other approaches have included use of
limited numbers of cDNAs on macroarrays (Zhao et al., 2003).
Given the initial lack of large-scale cDNA
arrays for the pig, human arrays have been tested and used
(Moody et al., 2002; Gladney et al., 2004). These initial
experiments with such materials have proved quite valuable, as
reproducibility was generally high and results were reasonable.
The outstanding progress in producing large numbers of pig ESTs
has now allowed for large-scale expression analysis using porcine
materials only. In one of the first studies, Pomp and colleagues
(Caetano et al., 2002; Caetano et al., 2003) used cDNA
derived from ovary and follicular RNA from animals from
either an index line selected for higher litter size or a control
line and co-hybridized them with 4,600 follicle-derived probes
to study gene expression patterns related to reproductive
efficiency. Other projects exist, including two large-scale
efforts in Europe. The first European Community supported
project is called PathoCHIP (http://www.pathochipproject.com) and
uses spotted cDNA arrays for disease organism and immune
response genes, while the second, called QualityPorkGENES
(www.qualityporkgenes.com), looks at the co-expression of
genes related to meat quality. Cooperative efforts by the US Pig
Genome Coordinator and US and international researchers
have now been directed to developing a first-stage cDNA or
oligo spotted array for the pig genome community. It is
expected that such an array will be commercially available in
mid 2003. This array and others to be developed later will
advance functional analysis in the pig.

Pigs in biomedical research and transplantation 

Biomedical interest in the pig as a model for human biology
has existed for several years (Tumbleson and Schook, 1996).
This research has covered nutrition, digestive physiology,
kidney and heart function, diabetes, and obesity.
One example is the discovery of a mutant MC4R gene in
pigs causing obesity similar to that seen in humans (Kim et al.,
2000).

Shortages of tissues and organs worldwide have increased
interest in xenotransplantation. Because of its size and
physiology, the pig is seen as a preferred donor. Recent concerns
about retroviruses and the difficulty in creating transgenic pigs
that meet all the requirements for safe transplantation have
slowed progress in this area, and several major companies
conducting such research have scaled back their active research.


Conclusions 

The pig genome offers new insights for both agricultural
purposes and human biomedical concerns, and understanding it
remains a significant challenge. Large-scale gene and trait
identification and mapping have taken place and a number of
gene tests to improve pork production are in use in the pig
industry. Sequencing and expression analysis have been initiated
and offer new avenues to understand the biological complexity of
the pig. The pig genomics community need not now rely solely on
developments from other organisms such as the human and the
mouse. We now have the opportunity to make the proverbial silk
purse from a sow’s ear. The state of the art of pig gene
discovery and functional genomics clearly demonstrates such
commitment and progress.

Abstract. In this study we examined homologies between
1,735 porcine microsatellites and human sequence. For 1,710
microsatellites we directly used the sequence flanking the
repeat available in GenBank. For a set of 305 microsatellites, a
BAC library was screened and end-sequencing provided 461
additional sequences. Altogether 2,171 porcine sequences were
tentatively aligned with the sequence of the human genome
using the fasta program. Human homologies were observed for
652 microsatellite loci, and porcine chromosome assignments
available for 623 microsatellites provide useful links in the
human and pig comparative map. Moreover, for 92 STS, a
significant sequence similarity was detected using at least two
sequences and in all cases corresponding human locations were
consistent. The present study allowed the integration of
anonymous markers and the porcine linkage map into the framework
of the comparative data between human and porcine genomes
(http://w3.toulouse.inra.fr/lgc/pig/msat/). Moreover, all
conserved syntenic segments were defined on human chromosomes.

Copyright©2003 S. Karger AG, Basel 


In the last few years, construction of mapping tools, such as
radiation hybrid panels (Yerle et al., 1998, 2002), and the
development of cDNA libraries in pigs have been important for
the identification and mapping of genes in this species.
However, the number of mapped loci needs to be increased in order
to improve the resolution of the comparative mapping between
the human and porcine species, which is a prerequisite for
positional cloning strategies of quantitative trait loci (QTL).
Bidirectional painting has provided a framework of synteny
groups conserved between the pig and human (Goureau et al.,
1996). Moreover, gene mapping has confirmed the
large-scale correspondences identified between human and pig
(Lahbib-Mansais et al., 1999, 2000, 2003; Pinton et al., 2000;
Rink et al., 2002).

Currently more than 1,200 markers have been mapped on
porcine genetic linkage maps (Rohrer et al., 1994, 1996;
Archibald et al., 1995; Marklund et al., 1996). Moreover, mapping
on the radiation hybrid panel IMpRH (Yerle et al., 2002) is in
full swing (Hawken et al., 1999; Karnuah et al., 2001;
Korwin-Kossakowska et al., 2002; Krause et al., 2002; Rink et
al., 2002; Lahbib-Mansais et al., 2003) and 6,000 markers are
already mapped. A first-generation RH comparative map of the
porcine and human genomes has already been published (Rink et
al., 2002; Lahbib-Mansais et al., 2003) but this map includes only
EST markers. Since the human genome project is nearly finished,
comparative mapping approaches using the human information
greatly facilitate the construction of physical maps in
other mammalian species. Moreover, the incorporation of
non-coding sequence into the comparative mapping framework
can now be considered, even if this strategy seems laborious.

Farber and Medrano (2003) developed an approach to identify
homologies between livestock microsatellite flanking
sequences (STS sequences) and the human genome sequence.
In this study we present two systematic approaches to
identify genome-wide relationships between microsatellite loci
and human sequence for 1,735 porcine microsatellite loci. This
study allowed characterization of 623 new anchoring loci on
the human and pig comparative map.

Table 1. Evaluation grid used to select hits after a fasta analysis. Selection was based on the percentage of
identity; the minimal percentage is a function of the length of the alignment detected by the fasta program. The
second line given for an alignment length (italics in the original) takes gaps into account, which were accepted
only for large alignments. This second option was not used for BAC-end sequences, because they did not contain
microsatellites, which were masked and could induce a major gap inside the alignment. The examples given are
fictitious.

Alignment (bp)   % Identity minimal   % Ungapped minimal
40               88                   88
50               84                   84
60               80                   80
                 76                   82
70               78                   78
                 74                   82
80               76                   76
                 72                   78
90               74                   74
                 66                   76
100              72                   72
                 64                   74
110              70                   70
                 60                   72
120              68                   68
                 58                   70
130              66                   66
                 56                   68
140              65                   65
                 54                   66
>150             65                   65
                 50                   65

Examples (description of hits obtained by fasta analysis)
90% identity (90% ungapped) in 72 nt overlap: hit retained
74% identity (83% ungapped) in 72 nt overlap: hit retained
70% identity (83% ungapped) in 72 nt overlap: hit not retained
73% identity (83% ungapped) in 78 nt overlap: hit retained
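
The evaluation grid of Table 1 can be read as a small lookup followed by two threshold tests. The Python sketch below is only an illustration of that reading, not the authors' script. Several points are assumptions: the row is chosen as the listed alignment length nearest to the observed one (inferred from the 78 nt example, which is retained only if it is judged against the 80 bp row), alignments shorter than 40 bp are rejected, and the 80 bp and 130 bp rows are read as 76/76 and 66/66 where the printed values are unclear. The names `GRID`, `nearest_row` and `hit_retained` are hypothetical.

```python
# Evaluation grid transcribed from Table 1 (values as read; see assumptions above).
# length (bp) -> (minimum % for the first line, (min % identity, min % ungapped) for
# the second, gap-tolerant line, or None when no second line is given)
GRID = {
    40: (88, None),
    50: (84, None),
    60: (80, (76, 82)),
    70: (78, (74, 82)),
    80: (76, (72, 78)),
    90: (74, (66, 76)),
    100: (72, (64, 74)),
    110: (70, (60, 72)),
    120: (68, (58, 70)),
    130: (66, (56, 68)),
    140: (65, (54, 66)),
    150: (65, (50, 65)),  # stands for the ">150" row
}


def nearest_row(length):
    """Pick the grid row for an alignment length (assumed rule: nearest length)."""
    if length < 40:
        return None  # assumption: alignments below the smallest row are rejected
    if length >= 150:
        return 150
    # Ties between two rows go to the shorter (stricter) row.
    return min(GRID, key=lambda row: (abs(row - length), row))


def hit_retained(identity, ungapped, length, allow_gapped=True):
    """Return True when a hit passes the grid.

    allow_gapped=False mimics the rule that the second line was not used
    for BAC-end sequences.
    """
    row = nearest_row(length)
    if row is None:
        return False
    strict, gapped = GRID[row]
    if identity >= strict and ungapped >= strict:
        return True
    if allow_gapped and gapped is not None:
        min_identity, min_ungapped = gapped
        return identity >= min_identity and ungapped >= min_ungapped
    return False


# The four fictitious examples listed with Table 1.
examples = [(90, 90, 72), (74, 83, 72), (70, 83, 72), (73, 83, 78)]
for identity, ungapped, length in examples:
    print(identity, ungapped, length, hit_retained(identity, ungapped, length))
```

Under these assumptions the four examples reproduce the outcomes stated with the table (retained, retained, not retained, retained).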




Materials and methods 

Microsatellites 

Porcine microsatellite flanking sequences were downloaded from the
GenBank sequence database using the Entrez nucleotide query website
(http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide).

Comparative sequence analysis 

RepeatMasker with option “other mammals” was used to mask simple
repeats and interspersed repetitive elements from each sequence set (Smit and
Green at http://ftp.genome.washington.edu/RM/repeatMasker.html). Each
masked sequence set was queried against the human genome sequence (primate
division of GenBank, January 2003) using the fasta program (Pearson and
Lipman, 1988; software version 3.3t09, May 2001). A first script was written
to automate the masking and fasta analysis. A second script was written to
extract results from the fasta output into a table containing all hits found
for each sequence. The table was sorted and all matches with an expected
value (e-value) > 10^-3 were discarded. When the number of hits was low (< 8)
only the first was retained. When the number of hits was very high (between 50
and 3,000) and/or when the probability of the first was similar to the
probability of the others, we retained no hits. We developed an evaluation grid
to retain or discard hits, and four examples are given to illustrate this
selection (Table 1).
Match annotation 

We use the term “hit” for the contents of the fasta analysis output file
and reserve the term “match” for a hit that has been selected and
annotated.

Porcine microsatellite map locations were identified from GenBank
annotation, or from available information in the ARKdb genome databases (Hu
et al., 2001; http://www.thearkdb.org), the USDA database
(http://sol.marc.usda.gov/), or the IMpRH database (Milan et al., 2000;
http://imprh.toulouse.inra.fr/).


Locations on the human genome were extracted from the UCSC human 
genome assembly of November 2002 (http://genome.cse.ucsc.edu/) or from 
the OMIM database (http://www.ncbi.nlm.nih.gov/Omim/). 

To validate matches, known conserved syntenic relationships were
examined using information available at
http://www.toulouse.inra.fr/lgc/pig/compare/SSC.htm.


Results 

A total of 1,735 microsatellites were analyzed in this study.
For 1,710 microsatellites, STS sequences (including flanking
sequences) were downloaded from GenBank. BAC-end sequences
were determined for 323 clones screened from the
INRA library (Rogel-Gaillard et al., 1999) with 305
microsatellites, providing 461 additional sequences (GenBank
BX465382-BX465833 and BX511324-BX511332).

In total 2,171 sequences (1,710 + 461) were then compared
to the human genome sequence with a fasta process with a
threshold for the e-value of 10^-3. Approximately 25,000 hits
were obtained and, after the selection of the best hit for each
comparison, we obtained approximately 1,000 hits (description
in Materials and methods). Usually hits with e-values > 10^-5 are
considered non-significant and are discarded. We empirically
developed an evaluation grid to select hits on the identity
percentage (Table 1). After this manual selection we retained
830 hits, and among them 25% had an e-value > 10^-5. The
human location was determined for all 830 human sequences
implicated in these 830 hits as described in Materials and
methods.

Table 2. All scores obtained in this study.

Hits     Description                                          Microsatellites
33 (a)   hits with an e-value > 10^-5 and incompatible        33; they were discarded
         with current knowledge of the comparative map
36       hits incompatible with current knowledge of the      36; they were not considered
         comparative map
503      hits found using STS sequence                        652 (b), of which 623 with porcine
258      hits found using BAC extremities (in 52 cases,       localization (= 623 anchor loci on the
         the homology was obtained with two or three BAC      comparative map) and 29 without
         sequences characterizing the same microsatellite)    porcine localization
830      Total

(a) These hits were not reported in Tables.
(b) We find the number of homologies detected using the two strategies (503 + 203 - 652 = 54).

Knowing the conserved syntenic relationships between human
and porcine genomes, a first examination of the results made it
possible to discard 33 matches which were not compatible with
current comparative data and whose e-values were not low
enough to be considered significant (> 10^-5). On the other hand,
36 homologies with significant e-values (< 10^-5) were not
considered. These results were not discarded but simply not
used to define conserved syntenic fragments; they appear in
all results tables. These 36 matches were not compatible with
current knowledge of the comparative map and 30 of the 36 were
only mapped on the IMpRH panel (Korwin-Kossakowska et
al., 2002; Krause et al., 2002). Moreover, if these 36 matches
were considered, they would characterize 36 new segments of
homology. After the final elimination of 33 matches and
setting aside the 36 others, there remain 761 matches which
relate to 652 microsatellites (Table 2). All were validated, but
porcine localization was not available for 29 microsatellites and
the corresponding matches were not informative. Therefore
623 new anchors on the comparative map were characterized.

Using STS sequences, 503 points of orthology were
characterized (29% of cases). With the second strategy, 323 BACs
were sequenced at one or both extremities, with 513 bp available
on average for each clone (Iannuccelli et al., in preparation). We
obtained a corresponding human location for 209 BACs and
203 microsatellites (score = 65%). When more than one BAC-end
sequence was available for one microsatellite, the corresponding
deduced human locations were always in agreement (52
cases). Moreover, for 54 microsatellites, orthologous positions
were detected using the two strategies (STS/BAC-end) and they
were always consistent. Among the 92 cases (52+54) where we
found a significant homology with at least two porcine
sequences characterizing the same microsatellite, 30/33 hits
obtained with an e-value > 10^-5 were confirmed by a second hit
(obtained with a second porcine sequence) with a very significant
probability (< 10^-10). Therefore human homology was obtained
for 652 porcine microsatellites, but 29 STS were not assigned on
the porcine genome and only 623 homology points were
characterized and exploitable.
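
The bookkeeping behind these counts can be checked with a few lines of arithmetic. The sketch below only restates figures quoted in the text and in Table 2; the variable names are hypothetical, and reading the reported 65% score as the proportion of BAC clones with a human location (209 of 323) is an assumption.

```python
# Counts quoted in the Results text and in Table 2.
sts_sequences = 1710        # STS flanking sequences from GenBank
bac_end_sequences = 461     # BAC-end sequences
total_sequences = sts_sequences + bac_end_sequences

hits_after_grid = 830       # hits retained after applying the evaluation grid
discarded = 33              # e-value > 10^-5 and incompatible with the comparative map
not_considered = 36         # significant e-value but incompatible with the map
validated_matches = hits_after_grid - discarded - not_considered

sts_microsatellites = 503   # homologies found with STS sequences
bac_microsatellites = 203   # microsatellites with a homology found via BAC ends
with_homology = 652         # microsatellites with a human homology overall
# Inclusion-exclusion: microsatellites detected by both strategies.
found_by_both = sts_microsatellites + bac_microsatellites - with_homology

unmapped_in_pig = 29        # no porcine localization available
anchor_loci = with_homology - unmapped_in_pig

sts_rate = round(100 * sts_microsatellites / sts_sequences)
bac_rate = round(100 * 209 / 323)  # assumed basis of the reported 65% score

print(total_sequences, validated_matches, found_by_both, anchor_loci, sts_rate, bac_rate)
# prints: 2171 761 54 623 29 65
```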


These 623 new points of homology between the porcine and
human genomes were compatible with current knowledge of
the comparative map available at
http://www.toulouse.inra.fr/lgc/pig/compare/SSC.htm. In
particular, our results confirmed the correspondence between
SSC3 and HSA7 (three matches), between SSC10 and HSA9 (four
matches), between SSC14 and HSA9 (one match), between SSC15 and
HSA4 (one match), between SSC15 and HSA8 (four matches),
and between SSC17 and HSA4 (two matches).

To illustrate the different categories of results obtained, we
present the hits observed for porcine microsatellites mapped on
SSC1 (Table 3). Seventy-seven hits were observed for 68
microsatellites. When several sequences were examined, similar
results were obtained. For example, for S0320, three BAC-end
sequences and the STS sequence matched two human
sequences on HSA9 with approximate positions of 113.26 and
113.36 Mb. These matches are consistent with each other and
this homology point is also compatible with present knowledge
of the comparative map, so this result has been validated.
As seen in Table 3, we obtained several hits with e-values > 10^-5.
When these hits were compatible with current knowledge of the
comparative map (SW1997, SW1020, SW373-) they were
retained. On the other hand, when we had no additional
arguments to retain these hits, they were discarded (SW1824,
SW1957). Four microsatellites (UMNp431, UMNp373,
UMNp528, UMNp616) had e-values low enough to be
considered reliable, but these four matches were not compatible
with current comparative genome data. These results were not
discarded but simply not considered. Therefore 65 microsatellites
originating from SSC1 had at least one significant match to
human genomic sequence and were grouped in five
chromosomal segments on HSA6, HSA9, HSA14, HSA15 and HSA18
(Table 3). Among these 65 matches only 39 had previously
been mapped on the USDA linkage map
(http://sol.marc.usda.gov/). The comparative map including the
porcine linkage map and the human physical map (Fig. 1a) was
compatible with information available at
http://www.toulouse.inra.fr/lgc/pig/compare/table.htm. In
addition, the conserved syntenic fragment between SSC1 and
HSA18, which had already been detected, was confirmed and
extended on the q-arm of SSC1.
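
The consistency check applied when several porcine sequences characterize the same microsatellite can be sketched as follows. This is an illustrative reading only: the text does not give a numerical criterion for "consistent" locations, so the requirement of a single human chromosome and the `max_span_mb` window are assumptions, as are the function name and the hypothetical `demo_marker` entry. The S0320 positions are those quoted in the text.

```python
from collections import defaultdict


def consistent_locations(hits, max_span_mb=1.0):
    """Map each microsatellite to True when all its human locations agree.

    hits: iterable of (microsatellite, human_chromosome, position_mb).
    Agreement (assumed definition): one chromosome and a position spread
    no larger than max_span_mb.
    """
    by_marker = defaultdict(list)
    for marker, chromosome, position in hits:
        by_marker[marker].append((chromosome, position))
    result = {}
    for marker, locations in by_marker.items():
        chromosomes = {chromosome for chromosome, _ in locations}
        positions = [position for _, position in locations]
        spread = max(positions) - min(positions)
        result[marker] = len(chromosomes) == 1 and spread <= max_span_mb
    return result


hits = [
    # S0320: three BAC-end sequences and the STS sequence, all on HSA9.
    ("S0320", "9", 113.26),
    ("S0320", "9", 113.36),
    ("S0320", "9", 113.36),
    ("S0320", "9", 113.36),
    # Hypothetical marker whose two sequences point to different chromosomes.
    ("demo_marker", "6", 10.0),
    ("demo_marker", "9", 10.0),
]
print(consistent_locations(hits))
```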
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Table 3. 77 hits involving 68 porcine microsatellites localized on SSC1. When at least two porcine sequences characterizing a same microsatellite were 
used, hits are indicated by a vertical connecting line. Hits were classified by the position of the homology point on a human chromosome. 


Porcine microsatellite

Porcine linkage position (USDA, cM)

Porcine sequence (a)

SW1957 b 


AF253752 

UMNp431 c 


AF511275 

microsatellite 


AJ277795 

SW1653 

49.4 

AF253682 

S0396 


U78022 

SW2185 

67.6 

BX465721 

SW1430 

58.5 

AF253621 

S0317 


X77282 

S0318 


X77281 

SW1123 

52.9 

AF225089 

SW1997 

53.6 

AF253768 

UMNp192


AF511132 

SW952 | 

56.5 

BX465473 

SW952 |

56.5 

BX465474 

M. triadin 


AJ224992 

SW1851 

44.6 

AF225128 

S0008 


43.5 

M97235 

S0008 


43.5 

BX465652 

UMNp467 


AF511298 

sZ002 


AF279701 

UMNp160


AF375756 

SW1332 

29.2 

AF253594 

SW137 

23.5 

AF235212 

SW64 

23.5 

AF225100 

UMNp380 


AF511243 

SWR485 

16.4 

AF235454 

SW2184 


AF253814 

UMNp373 c 


AF511238 

HY-N13 


AB050040 

HY-N26 


AB050046 

S0020 


83.2 

BX465393 

S0020 


83.2 

BX465555 

S0142

83.2 

BX465393 

SW1970 

83.2 

AF253756 

SW780 


BX465635 

UMNp55 


AF375685 

SW1020 

83.7 

AF253566 

SW2551 

95.8 

AF225182 

UMNp484 


AF511312 

HY-N15


AB050042 

SW1462 


93.9 

BX465776 

SW1462 


93.9 

AF253631 

HY-N19 


AB050043 

SW974 

102.9 

AF225111 

S0302 


102.9 

BX465686 

S0302 


102.9 

U10321 

S0354 

104.5 

L29228 

HY-N6 


AB050036 

UMNp367 


AF511235 

S0320 


112.5 

BX465446 

S0320 


112.5 

BX465438 

S0320 


112.5 

BX465447 

S0320 



X77284 

SW705 

122.6 

AF235342 

SW1301 


140.5 

BX465487 

SW1301 


140.5 

BX465488 

HYN1 


BX465502 

SW1824 b 


AF225127 

UMNp528 c 


AF511339 

UMNp616 c 


AF511394 

UMNp41 


AF375673 

S0313 


X76937 

SW962 

80.5 

AF235495 

SW216 

82.4 

BX465710 

sZ003 d 


AF279702 

UMNp29 d 


AF511181 

UMNp486 


AF511314 

UMNp256 


AF511157 

UMNp117


AF375732 


Human 

sequence 

HSA 

Approximate 
position (Mb) 

Expected 

Value 

AC108153 

4 

53.9 

0.0083 

AC079768 

4 

171.21 

2.2e-l 1 

AL590668 

6 

49.41 

3.1e-23 

AL109922 

6 

65.67 

1.3e-05 

AL357507 

6 

74.86 

2.2e-09 

AL049696 

6 

80.95 

0.00057 

AL359715 

6 

81.09 

2.1e-05 

AL590143 

6 

85.36 

2.8e-15 

AL590143 

6 

85.36 

2.8e-15 

AL590392 

6 

88.58 

5.7e-09 

AL592428 

6 

92.58 

0.0015 

AL512490 

6 

108.16 

2.9e-13 

AL500524 

6 I 

114.56 

2.2e-27 

AL021327 

6 I 

114.74 

3.3e-17 

AL603902 

6 

123.37 

3.6e-06 

AL357274 

6 

130.79 

5.2e-05 

AC005587 

6 


131.90 

1.0e-08 

AC005587 

6 


131.90 

2.9e-09 

AL589674 

6 

142.22 

1.3e-34 

AL049844 

6 

144.05 

6.4e-05 

AL109755 

6 

144.17 

3.1e-26 

AL023283 

6 

145.41 

0.00032 

AL359252 

6 

149.01 

3.8e-08 

AL359252 

6 

149.01 

3.5e-14 

AL078582 

6 

152.32 

9.3e-38 

AL589963 

6 

152.74 

0.00015 

AL078583 

6 

161.98 

4.5e-08 

AC005686 

7 

33.94 

9.4e-07 

AL591644 

9 

1.94 

6.7e-15 

AL353741 

9 

6.15 

4.9e-13 

AL160053 

9 


16.33 

3.7e-44 

AL160053 

9 


16.33 

7.3e-05 

AL160053 

9 

16.33 

3.7e-44 

AL353895 

9 

18.86 

7.3e-08 

AL133281 

9 

19.86 

3.5e-22 

AL512635 

9 

20.46 

6.4e-10 

AL451137 

9 

26.94 

0.0066 

AL161781 

9 

37.15 

2.1e-09 

AL 162412 

9 

64.21 

0.0093 

AL158154 

9 

76.35 

6.6e-06 

AL357032 

9 


76.75 

1.4e-07 

AL 162726 

9 


76.91 

7.5e-09 

AC068050 

9 

102.53 

2.4e-23 

AL358779 

9 

103.25 

5.6e-07 

AL162733 

9 


104.01 

9.6e-05 

AL162733 

9 


104.01 

1.8e-07 

AL139041 

9 

107.41 

1.4e-22 

AL359455 

9 

108.00 

3.7e-12 

AL161630 

9 

112.00 

5.1e-09 

AL157780 

9 


113.26 

4.5e-20 

AL512602 

9 


113.36 

3.5e-29 

AL512602 

9 


113.36 

3.7e-19 

AL512602 

9 


113.36 

5.2e-12 

AL137846 

9 

118.82 

1.3e-06 

AL354855 

9 


125.76 

3.4e-l 1 

AL358781 

9 


125.89 

6.6e-22 

AL513102 

9 

126.48 

3.9e-13 

AC 117502 

12 

69.7 

0.0025 

AC084881 

12 

119.34 

8.6e-09 

AL 16245 5 

13 

88.91 

7.8e-34 

AL138498 

14 

36.15 

0.00002 

AL161664 

14 

40.41 

2.3e-17 

AL049874 

14 

54.87 

0.0057 

AF215937 

14 

58.50 

1.9e-55 

AL132641 

14 

79.95 

0.0021 

AL358292 

14 

81.33 

6.6e-05 

AC091074 

15 

39.71 

6.8e-12 

AC024061 

15 

46.92 

1.0e-10 

AC018618 

15 

55.06 

9.8e-05 


Description of alignment 


69.369% identity (72.642% 
70.701% identity (72.549% 
67.708% identity (79.592% 
59.574% identity (71.795% 
72.165% identity (76.503% 
67.568% identity (67.568% 
75.000% identity (80.488% 
74.390% identity (75.776% 
74.390% identity (75.776% 
80.851% identity (81.720% 
52.217% identity (72.109% 
70.335% identity (73.869% 
75.325% identity (77.852% 
72.000% identity (73.303% 
75.439% identity (78.182% 
78.462% identity (78.462% 
69.444% identity (73.529% 
79.798% identity (81.443% 
77.936% identity (79.348% 
71.560% identity (74.286% 
73.294% identity (78.165% 
80.328% identity (85.965% 
66.364% identity (68.545% 
73.410% identity (76.506% 
70.757% identity (72.267% 
78.571% identity (79.710% 
90.278% identity (90.278% 
57.143% identity (71.429% 
69.430% identity (72.043% 
69.608% identity (72.821% 
75.245% identity (78.920% 
73.118% identity (75.556% 
75.245% identity (78.920% 
55.729% identity (79.851% 
73.874% identity (74.886% 
71.429% identity (77.957% 
76.543% identity (78.481% 
87.013% identity (89.333% 
65.432% identity (71.622% 
67.722% identity (74.306% 
85.057% identity (87.059% 
83.871% identity (84.783% 
76.232% identity (79.697% 
81.481% identity (88.000% 
75.269% identity (76.923% 
73.451% identity (74.107% 
80.588% identity (83.537% 
64.122% identity (68.852% 
73.950% identity (75.862% 
67.333% identity (68.942% 
72.107% identity (75.000% 
72.107% identity (75.000% 
60.891% identity (75.926% 
70.732% identity (74.359% 
69.744% identity (71.958% 
80.503% identity (81.013% 
67.511 % identity (71.749% 
80.000% identity (80.000% 
71.324% identity (73.485% 
66.728% identity (69.157% 
74.790% identity (77.391% 
67.949% identity (75.177% 
73.171% identity (77.922% 
78.555% identity (81.882% 
63.298% identity (67.614% 
76.923% identity (76.923% 
73.288% identity (74.306% 
73.418% identity (73.418% 
72.951% identity (79.464% 


ungapped) in 111 nt overlap 
ungapped) in 157 nt overlap 
ungapped) in 288 nt overlap 
ungapped) in 141 nt overlap 
ungapped) in 194 nt overlap 
ungapped) in 111 nt overlap 
ungapped) in 88 nt overlap 
ungapped) in 164 nt overlap 
ungapped) in 164 nt overlap 
ungapped) in 94 nt overlap 
ungapped) in 203 nt overlap 
ungapped) in 209 nt overlap 
ungapped) in 308 nt overlap 
ungapped) in 225 nt overlap 
ungapped) in 114 nt overlap 
ungapped) in 65 nt overlap 
ungapped) in 144 nt overlap 
ungapped) in 99 nt overlap 
ungapped) in 281 nt overlap 
ungapped) in 109 nt overlap 
ungapped) in 337 nt overlap 
ungapped) in 61 nt overlap 
ungapped) in 220 nt overlap 
ungapped) in 173 nt overlap 
ungapped) in 383 nt overlap 
ungapped) in 70 nt overlap 
ungapped) in 72 nt overlap 
ungapped) in 175 nt overlap 
ungapped) in 193 nt overlap 
ungapped) in 204 nt overlap 
ungapped) in 408 nt overlap 
ungapped) in 93 nt overlap 
ungapped) in 408 nt overlap 
ungapped) in 192 nt overlap 
ungapped) in 222 nt overlap 
ungapped) in 203 nt overlap 
ungapped) in 81 nt overlap 
ungapped) in 77 nt overlap 
ungapped) in 162 nt overlap 
ungapped) in 158 nt overlap 
ungapped) in 87 nt overlap 
ungapped) in 93 nt overlap 
ungapped) in 345 nt overlap 
ungapped) in 81 nt overlap 
ungapped) in 93 nt overlap 
ungapped) in 113 nt overlap 
ungapped) in 170 nt overlap 
ungapped) in 262 nt overlap 
ungapped) in 119 nt overlap 
ungapped) in 300 nt overlap 
ungapped) in 337 nt overlap 
ungapped) in 337 nt overlap 
ungapped) in 202 nt overlap 
ungapped) in 123 nt overlap 
ungapped) in 195 nt overlap 
ungapped) in 159 nt overlap 
ungapped) in 237 nt overlap 
ungapped) in 65 nt overlap 
ungapped) in 136 nt overlap 
ungapped) in 541 nt overlap 
ungapped) in 119 nt overlap 
ungapped) in 312 nt overlap 
ungapped) in 82 nt overlap 
ungapped) in 443 nt overlap 
ungapped) in 188 nt overlap 
ungapped) in 78 nt overlap 
ungapped) in 146 nt overlap 
ungapped) in 158 nt overlap 
ungapped) in 122 nt overlap 
Table 3 (continued)


Porcine         Porcine     Porcine     Human     HSA  Approximate    Expected  Description of alignment
microsatellite  linkage     sequence a  sequence       position (Mb)  value
                position
                (USDA, cM)

SW2073          79.4        AF253788    AC016355  15   60.89          2.7e-18   89.744% identity (92.920% ungapped) in 117 nt overlap
HY-N21          -           AB050044    AC087593  15   82.12          2.2e-20   71.318% identity (74.194% ungapped) in 258 nt overlap
UMNp84          -           AF375710    AC069029  15   93.27          3.4e-24   75.745% identity (79.111% ungapped) in 235 nt overlap
SW373           119.5       AF225095    AC090916  18   35.81          0.00049   81.667% identity (83.051% ungapped) in 60 nt overlap
SW1668          60.2        AF253686    AC023421  18   43.04          1.6e-08   73.276% identity (78.704% ungapped) in 116 nt overlap
UMNp345         -           AF511218    AC018994  18   53.10          4.4e-16   81.633% identity (83.333% ungapped) in 147 nt overlap
UMNp330         -           AF511204    AC107990  18   57.51          1.1e-09   55.450% identity (68.824% ungapped) in 211 nt overlap
S0331           -           L36911      AC011930  18   67.75          7.7e-06   80.000% identity (83.333% ungapped) in 75 nt overlap


a BAC-end sequences are indicated in italics. 

b Two hits were discarded because they harbored too high e-values and were incompatible with current knowledge of comparative maps. 
c Four hits were not compatible with current knowledge of comparative maps and were not retained. 
d Hits considered as provisional. 


Fig. 1. Visualization of conserved syntenic fragments between the human genome and SSC1. (a) The order of orthologous loci is
compared between the USDA linkage map and the human physical map. (b) Visualization of human homologies on the USDA linkage
map of SSC1. Some markers localized on the porcine cytogenetic map were added to the USDA linkage map (at the left of the
chromosome drawing). The visualization of human homologies on the USDA linkage map is available for all porcine chromosomes
at http://w3.toulouse.inra.fr/lgc/pig/msat/.
All homology data are reported in tables sorted by porcine or human
chromosomes and are available at http://w3.toulouse.inra.fr/lgc/pig/msat/.
When linkage data are available (60% of markers) it is possible to show
conserved syntenic fragments on porcine linkage maps. All figures (e.g.
Fig. 1b) are available at http://w3.toulouse.inra.fr/lgc/pig/msat/. When we
examined all results sorted by human chromosomes, we were able to
establish the precise location of the conserved syntenic breakpoints on
human chromosomes. For example, we were able to determine that the
human chromosomal segment from HSA14 homologous to SSC1 was divided
into two sub-segments. This observation is not concordant with Rink et al.
(2002) and was based on two matches having a high e-value. Consequently,
this segmentation into two sub-segments should be considered
provisional. Figure 2 allows visualization on human chromosomes of all
conserved syntenic segments found here. The same figure is available in
color at http://w3.toulouse.inra.fr/lgc/pig/msat/.

Discussion 

Until now, results have been accumulating on the comparative map between
the porcine and human genomes owing to the mapping of ESTs. Radiation
hybrid mapping is included in this comparative map, but it is very
difficult to use the linkage map without intermediate cytogenetic or RH
maps. Here we propose a new strategy to include genetic markers in the
comparative data.

We did not select hits on the basis of the e-value alone. The selected
results include 25% with an e-value > 10⁻⁵. In sequence alignments such
hits are generally discarded; here we tried to save results obtained with
short sequences. We empirically developed an evaluation grid. Selections
were made using the percentage of identity, but the minimal rate was a
function of the length of the alignment. Gaps inside the alignment were
accepted only for large matches. Obtaining 92 results that were consistent
for at least two different porcine sequences, and especially 30 hits
observed with a non-significant e-value (> 10⁻⁵) and confirmed by a very
significant second hit, demonstrated the value of this evaluation grid.
Nevertheless, this grid is empirical: 15% of hits having an e-value
> 10⁻⁵ (33 discarded) and a maximum of 6% of hits with an e-value < 10⁻⁵
(36 not considered) were wrongly selected. The methodology employed must
yield a maximum of results that fit within the framework of current
knowledge of the comparative map, so that the few novelties can be
described safely.
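
The selection logic described above can be sketched in a few lines. The text states only the principles (a minimum identity that depends on the alignment length, and gaps tolerated only in large matches); the thresholds in `GRID`, the cut-off `MIN_LEN_FOR_GAPS` and the names `Hit` and `passes_grid` are hypothetical placeholders chosen for illustration, not the values used in the study.

```python
from dataclasses import dataclass

# Hypothetical grid: (minimum overlap in nt, minimum % identity required).
# Shorter alignments must be more similar to be accepted.
GRID = [(300, 65.0), (150, 70.0), (60, 78.0)]
MIN_LEN_FOR_GAPS = 150  # hypothetical: gaps tolerated only in long matches


@dataclass
class Hit:
    identity: float           # % identity over the gapped alignment
    identity_ungapped: float  # % identity ignoring gap columns
    overlap: int              # alignment length in nt
    evalue: float             # kept for reporting, not used as sole filter


def passes_grid(hit: Hit) -> bool:
    """Accept a hit on identity and length rather than on e-value alone."""
    # Assumption: the two identity figures differ only when gaps are present.
    has_gaps = hit.identity_ungapped > hit.identity
    if has_gaps and hit.overlap < MIN_LEN_FOR_GAPS:
        return False
    for min_overlap, min_identity in GRID:
        if hit.overlap >= min_overlap:
            return hit.identity >= min_identity
    return False  # shorter than the smallest length class


# Two alignments whose figures are taken from Table 3.
short_hit = Hit(78.462, 78.462, 65, 5.2e-05)   # short, ungapped (SW1851)
long_hit = Hit(78.555, 81.882, 443, 1.9e-55)   # long, gapped (SW216)
print(passes_grid(short_hit), passes_grid(long_hit))
```

With these placeholder thresholds the short SW1851 alignment is kept even though its e-value is above 10⁻⁵, which is the behavior the paragraph describes for short flanking sequences.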

We did not retain 36 porcine markers with matches on the human genome
whose e-values were sufficient to be considered reliable. These results
were incompatible with present data accumulated on the comparative map
between the human and porcine genomes. Moreover, none of these matches
has been confirmed by another available match here or elsewhere.
Therefore, if these 36 matches were considered, they would characterize
36 new segments of homology. For most of them, the chromosomal assignment
on IMpRH was obtained with a low LOD score. There is too much doubt about
these 36 markers to say that 36 new conserved syntenic fragments were
identified. Consequently, they will not be considered until a new
chromosomal assignment is obtained or they are confirmed by others. The
grid used here to select hits is therefore probably more effective than
we are able to show (15% and 6% wrong selections for e-values > 10⁻⁵ and
< 10⁻⁵, respectively). Novel conserved syntenic fragments are perhaps
included among these 36 matches, and they were included in all tables.

All chromosome correspondences identified by chromosomal painting were
confirmed. Among correspondences already characterized or suspected by
other authors, six were detected in our 623 results.

We identified human orthologous positions for 65 porcine microsatellite
loci originating from SSC1, but only 39 could be used to produce a
bi-directional comparative map (Fig. 1). These results are compatible
with bi-directional painting (Goureau et al., 1996) and allow the
definition of conserved syntenic segments on the linkage map of this
porcine chromosome. The conserved syntenic fragments SSC1/HSA14 and
SSC1/HSA15 appeared to be localized near SSC1q2.1, but the linkage map of
this region did not allow a good exploration of the segment q2.2-q2.7. On
the other hand, the dispersion of the conserved synteny SSC1/HSA18 along
the q-arm is confirmed. Moreover, we were able to study the order of loci
inside the conserved syntenic chromosomal segments of HSA6 and HSA9. The
locus order is conserved only inside small sub-regions, and it appears
very important to increase the density of loci on the comparative map to
avoid concluding too quickly that the order of loci is conserved.
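
One way to make the statement about locus order concrete is to walk along the human chromosome and split the markers into runs in which the porcine linkage position keeps moving in one direction. The sketch below does this for the HSA6 hits of Table 3 that have a USDA linkage position (one row per marker). The greedy run-splitting and the function name `monotonic_runs` are illustrative assumptions, not the procedure used to draw Fig. 1.

```python
# (marker, USDA linkage position in cM, approximate HSA6 position in Mb),
# values read from the HSA6 rows of Table 3.
HSA6_HITS = [
    ("SW1653", 49.4, 65.67), ("SW2185", 67.6, 80.95), ("SW1430", 58.5, 81.09),
    ("SW1123", 52.9, 88.58), ("SW1997", 53.6, 92.58), ("SW952", 56.5, 114.56),
    ("SW1851", 44.6, 130.79), ("S0008", 43.5, 131.90), ("SW1332", 29.2, 145.41),
    ("SW137", 23.5, 149.01), ("SW64", 23.5, 149.01), ("SWR485", 16.4, 152.74),
]


def monotonic_runs(hits):
    """Walk along the human chromosome and start a new run each time the
    porcine linkage position reverses direction."""
    ordered = sorted(hits, key=lambda h: h[2])
    runs, current, direction = [], [ordered[0]], 0
    for prev, cur in zip(ordered, ordered[1:]):
        step = (cur[1] > prev[1]) - (cur[1] < prev[1])  # +1, 0 or -1
        if direction == 0 or step == 0 or step == direction:
            direction = direction or step
            current.append(cur)
        else:
            runs.append(current)
            current, direction = [cur], 0
    runs.append(current)
    return runs


for run in monotonic_runs(HSA6_HITS):
    print([name for name, _, _ in run])
```

On these twelve markers the walk yields three runs of two markers and one run of six (SW1851 to SWR485), which illustrates why a denser comparative map is needed before concluding that order is conserved over a whole segment.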

This study allowed the integration of the porcine linkage map into the
framework comparative map. This approach is very useful for QTL studies
because it avoids using RH mapping as an intermediate step. Figures
including new results of comparative maps anchored on the linkage map of
each porcine chromosome are available on the web. It would be tedious to
describe each contribution of the integration of the porcine linkage map
into the framework of comparative mapping; therefore only one is detailed
here. Homologies have already been detected between loci from the p-arm
of SSC5 and HSA22 (Rink et al., 2002; Lahbib-Mansais et al., 2003). We
reported a match between SW152 and a human sequence originating from
HSA22. The sub-region of SSC5 around SW152 was not "attributed" on the
comparative map of Lahbib-Mansais et al. (2003), and it is possible that
a new conserved syntenic fragment has been characterized here.

It is not possible to integrate all results in comparative maps
initialized on porcine chromosomes because the markers originate from
several porcine maps. On the other hand, all results are useful and were
used to initialize a comparative map on human chromosomes. Moreover, when
we examined all results sorted by human chromosome (table available on
the web at http://w3.toulouse.inra.fr/lgc/pig/msat/), we were able to
improve the precision of the localization of the conserved syntenic
breakpoints on each human chromosome. Figure 2 allows visualization on
human chromosomes of all conserved syntenic segments found here. Unlike
Fig. 1, this representation does not plot each homology as a separate
point: we connected identical homologies to better visualize conserved
syntenic fragments. The risk is that a segment of conserved synteny
includes some sub-chromosomal regions without any anchor point (50 Mb,
for example, on HSA1). Nevertheless, we did not detect important
disagreements with bi-directional painting descriptions (Goureau et al.,
1996). Our results were concordant with all published results for HSA3,
HSA13, HSA17, HSA20, HSA21 and HSAX. On HSA2, HSA5 and HSA11 we describe
a simpler situation than that reported by Rink et al. (2002): no overlaps
were detected. On HSA16 and HSA19 we did not find overlaps, but the
coverage of these chromosomes is not complete here (Fig. 2). For HSA4,
HSA6, HSA10, HSA14, HSA15 and HSA18 our results allowed slightly
different definitions of chromosomal fragments from those published by
Rink et al. (2002). After a new analysis of the results (positions on the
human sequence) reported by Nonneman and Rohrer (2003), MAPK8 confirmed
the existence of the short sub-segment in the correspondence HSA10/SSC10,
characterized here by only one result (near 48 Mb on HSA10). Comparative
maps of HSA1 and HSA7 appeared more complex, and it was very difficult to
describe the disagreements between our description and that of Rink et
al. (2002) or that of Lahbib-Mansais et al. (2003). Nevertheless, the
definitions of the conserved syntenic chromosomal fragments HSA1/SSC4 and
HSA8/SSC4 were compatible with those reported by Fujishima-Kanaya et al.
(2003). On HSA22, our results showed a possible segmentation of the
correspondence with SSC5 into two sub-segments (upstream and downstream
of the conserved syntenic fragment with SSC14). The first (SW152,
described above) would be new, and we did not detect the second,
characterized by Lahbib-Mansais et al. (2003) and Rink et al. (2002),
because the coverage of HSA22 is not complete here. On HSA12 (near 12 Mb)
and on HSA9 (near 80 Mb), the segmentation of the correspondence with
SSC14 (only one marker) and SSC10 (two markers), respectively, detected
here has not been reported by Lahbib-Mansais et al. (2003) or by Rink et
al. (2002). Lastly, our results allowed a description of the
correspondence between HSA8 and SSC15 (four markers) not detected by
Lahbib-Mansais et al. (2003) or by Rink et al. (2002).

Fig. 2. Conserved syntenic breakpoints determined on human chromosomes. The existence of one segment is provisional
(indicated by *). This same figure in color is available on the web at http://w3.toulouse.inra.fr/lgc/pig/msat/.

In summary, these results increase the number of links between porcine
maps and the human physical map. Microsatellites are usually considered
anonymous markers, and in this study we demonstrate that they can be
integrated into the comparative map in 29% of cases. Flanking sequences
of microsatellites are usually very short, and comparison of BAC-end
sequences is more effective (65%). The score observed with the second
strategy demonstrates the value and the feasibility of sequencing the
porcine genome by a shotgun method using the human sequence as support.
Abstract. A comprehensive comparative map was constructed for the
porcine chromosome (SSC) 6q11→q21 region, to which the gene(s)
responsible for the maldevelopment of embryos have been localized using
swine populations of the National Institute of Animal Industry, Japan
(NIAI). Since this chromosomal region corresponds to a region of human
chromosome (HSA) 19q13.1→q13.3 based on bi-directional chromosome
painting, primer pairs were designed from porcine cDNA sequences
identified, on a sequence comparison basis, as being transcripts from
genes orthologous to those in the HSA region. Fifty-one genes were
successfully assigned to a swine radiation hybrid (RH) map with LOD
scores greater than 6. The ERF and PSMD8 genes were assigned to SSC4 and
SSC1, respectively. The remaining 49 genes were assigned to SSC6,
demonstrating that the synteny between the SSC6 and HSA19 chromosomal
regions is essentially conserved and therefore confirming the results of
bi-directional chromosome painting. However, when examined precisely,
rearrangements have apparently occurred within the region of conserved
synteny. For the ERF and PSMD8 genes assigned to SSCs other than SSC6,
additional mapping using somatic cell hybrid (SCH) panels was performed
to confirm the results of RH mapping.

Copyright © 2003 S. Karger AG, Basel


The comparative genetic approach, which transfers information from the
well-documented human and mouse genomes to the less analyzed genomes of
farm animals and the like, has become an established tool for
understanding genome evolution as well as for identifying candidate genes
for economic traits. It has, indeed, become one of the most powerful
approaches for the expansion and saturation of gene maps (Gellin et al.,
2000). The significance of comparative genome analysis in gene mapping
comes from the possibility of connecting the dense human, mouse, and rat
gene maps with agricultural species maps having lower gene densities,
leading to the improvement of those maps, which is a prerequisite for the
identification of genes responsible for economically important traits.

Recently, traditional mapping technologies such as genetic linkage
analysis and fluorescent in situ hybridization (FISH) have been greatly
supplemented by a new method called radiation hybrid (RH) mapping. RH
mapping has been used to resolve the order of genes/markers along the
chromosome in human genome mapping, and is currently employed as an
efficient technique for the generation of high-resolution gene maps for a
wide variety of animal species, including farm animals and fish (Deloukas
et al., 1998; Yang and Womack, 1998; Murphy et al., 1999; Van Etten et
al., 1999; Watanabe et al., 1999; Mellersh et al., 2000; Anver et al.,
2001; Hukriede et al., 2001; Rattink et al., 2001; Chowdhary et al.,
2003). This technique has been successfully incorporated into comparative
mapping approaches to demonstrate conservation and rearrangements in gene
order between different species. Thus, comparative gene mapping and the
RH mapping technique are increasingly powerful tools for the
identification of candidate genes for traits, including economically
important traits and genetic diseases (Womack, 1998). The ideal goal of
any mapping project, including swine mapping, has been to create a dense
gene map of each chromosome comparable to the human, mouse and rat maps
(Rattink et al., 2001; Kiuchi et al., 2002; McCoard et al., 2002; Rink et
al., 2002).

Our previous study (Yasue et al., 1999) demonstrated that the swine
resource family at the National Institute of Animal Industry, Japan
(NIAI) (Mikawa et al., 1999) harbored one or more recessive genes
responsible for the maldevelopment of embryos, and that they were
localized on the q-arm of swine chromosome SSC6, specifically between
markers RYR1 and SW782. The current comparative map between pig and
human, based on bi-directional chromosome painting, SCH mapping and FISH
(Rettenberger et al., 1995; Frönicke et al., 1996; Goureau et al., 1996)
(http://www.toulouse.inra.fr/lgc/pig/compare/SSCHTML), indicated that the
region of interest corresponds to the q-arm of human chromosome (HSA) 19.
However, only 13 genes with well-characterized human orthologs were
mapped on SSC6q11→q21
(http://www.toulouse.inra.fr/lgc/pig/cyto/gene/chromo/SSCG6.htm), whereas
around 400 genes are mapped on the corresponding human region according
to the human gene map (http://www.ncbi.nih.gov/cgi-bin/Entrez/maps.cg),
and their positions were not precisely determined along the porcine
chromosome. Therefore, it is impossible to infer the gene correspondences
between the swine and human chromosomal locations based on current
information, which in turn makes it difficult to select candidate
recessive gene(s) properly.

In the present study, to provide a dense map of the region for the
selection of candidate genes for fetus maldevelopment, 51 genes localized
in the HSA19 region were assigned to swine chromosomes by RH mapping and,
if necessary, by somatic cell hybrid (SCH) mapping.


Materials and methods 

Primer design 

The porcine expressed sequence tags (ESTs) registered in the database were
clustered based on sequence similarities and aligned with human sequences
assigned to the human physical map (http://genopole.toulouse.inra.fr/Iccare/).
From this website, the porcine sequences orthologous to human genes with
E values smaller than e-100 in the human genome region between ATP4 and
GYS1 were selected to design primer pairs inside putative exons, which
were deduced from the exon/intron structures of the corresponding human
genes. In addition, cDNA sequences generated in our research projects for
the construction of full-length cDNA libraries from various swine tissues
(in press) were processed as above to construct primer pairs for genes in
the region under investigation. Human genomic sequences of genes located
in the region between ATP4 and GYS1, as well as swine genomic sequences
corresponding to the genes, were also used for primer pair design. Primers
were designed using Primer3 software
(http://www.genome.wi.mit.edu/cgi-bin/primer/primer3.cgi) (Rozen and
Skaletsky, 2000).
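
As a quick sanity check on a designed pair, the GC content and a rule-of-thumb melting temperature can be computed directly from the sequences. The sketch below uses the ATP4A pair listed in Table 1 and the simple Wallace rule (2 °C per A/T, 4 °C per G/C); this rule is an assumption made for illustration and is not the thermodynamic model that Primer3 applies, so the values are only rough guides.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in °C: 2 per A/T plus 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# ATP4A primer pair as listed in Table 1.
forward = "TGCAGATCAGCTCGTGGTAG"
reverse = "GCACATGGTGGAGAAGAAGG"

for label, primer in (("F", forward), ("R", reverse)):
    print(label, len(primer), gc_percent(primer), wallace_tm(primer))
```

Both primers of this pair are 20 nt long with 55% GC, so the two estimates coincide, which is the kind of balance one usually looks for within a pair.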


Screening of PCR primer pairs for RH and SCH mapping

Primer pairs thus designed were subjected to PCR under the conditions
described by Hawken et al. (1999) to select those that amplify only swine
genomic DNA fragments of the expected size, or that amplify both swine
genomic DNA fragments of the expected size and Chinese hamster genomic
DNA fragments of sizes different from those of the swine fragments.
Briefly, the PCR conditions for each primer pair were optimized on 25 ng
and 35 ng of genomic DNA from pig and Chinese hamster, respectively, in
15 µl of 10 mM Tris-HCl (pH 9.0), 1.5 mM MgCl2, 50 mM KCl, 200 µM of each
dNTP, 1.0 µM of each primer pair, and 0.375 U AmpliTaq Gold (Applied
Biosystems, CA, USA). The mixture was heated at 95 °C for 9 min, followed
by 35 cycles of denaturation at 95 °C for 30 s, annealing at the
appropriate temperature indicated in Table 1 for 30 s, and extension at
72 °C for 30 s, followed by a final elongation step at 72 °C for 5 min.
All PCRs were carried out using the GeneAmp PCR System 9700 (Perkin-Elmer,
CT, USA) with appropriate positive and negative controls. The PCR products
were separated in a 2.0% agarose gel, stained with ethidium bromide, and
visualized using a UV light source. To confirm that the swine fragments
were the expected sequences, the resultant PCR products were sequenced
using the ABI PRISM BigDye Terminator Sequencing Ready Reaction Kit and a
3700 DNA Analyzer (Applied Biosystems, CA, USA) and then compared with the
sequences used for primer design. The primer pairs thus verified were
then used for the subsequent analysis.
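
The programmed duration of one run follows directly from the cycling profile above. The short calculation below adds up the hold times only; ramping between temperatures depends on the instrument and is left out, so the figure is a lower bound. The constant and function names are illustrative.

```python
# Hold times taken from the protocol above; ramp times are ignored.
INITIAL_DENATURATION_S = 9 * 60   # 95 °C for 9 min
CYCLES = 35
STEP_S = (30, 30, 30)             # denaturation, annealing, extension
FINAL_EXTENSION_S = 5 * 60        # 72 °C for 5 min


def total_hold_minutes() -> float:
    """Sum of the programmed hold times for one run, in minutes."""
    total_s = INITIAL_DENATURATION_S + CYCLES * sum(STEP_S) + FINAL_EXTENSION_S
    return total_s / 60


print(total_hold_minutes())  # 66.5
```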

RH mapping 

Using each of the primer pairs thus prepared and the DNAs of the 90
selected hybrid cell lines of the IMpRH 7000-rad panel, which were
provided by INRA (France) and the University of Minnesota (USA) (Yerle et
al., 1998), PCR was performed under the conditions described above. PCR
products were separated in a 2.0% agarose gel and scored for mapping as
described previously (Kiuchi et al., 2002). Assignments of genes to the
RH map were performed with the IMpRH mapping tool developed by Milan et
al. (2000), which is accessible on the INRA web server
(http://www.toulouse.inra.fr). The order of genes on the RH map was
calculated based on maximum likelihood using CarthaGene software
(http://www.inra.fr/bia/T/CarthaGene/). The stepwise locus ordering
strategy using the equal retention probability model was employed. This
model combines minimization of the obligate number of breaks required to
explain the observed retention patterns with maximum-likelihood analysis
(Schiex et al., 2001).
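
The "obligate number of breaks" criterion mentioned above can be illustrated on a toy example: between two adjacent markers, every hybrid that retains one marker but not the other implies at least one break. The sketch below counts these breaks and searches all marker orders exhaustively. It covers only the break-counting part, does not reproduce CarthaGene's algorithm or its maximum-likelihood step, and uses made-up retention vectors for four hypothetical markers scored on 8 hybrids instead of the 90 of the real panel.

```python
from itertools import permutations


def obligate_breaks(a, b):
    """Number of hybrids in which exactly one of two markers is retained."""
    return sum(x != y for x, y in zip(a, b))


def best_order(vectors):
    """Exhaustively search marker orders for the fewest obligate breaks.

    Only practical for a handful of markers; real maps need heuristics.
    """
    best = None
    for order in permutations(sorted(vectors)):
        if order[0] > order[-1]:  # skip the mirror image of an order
            continue
        cost = sum(obligate_breaks(vectors[m], vectors[n])
                   for m, n in zip(order, order[1:]))
        if best is None or cost < best[0]:
            best = (cost, order)
    return best


# Hypothetical retention patterns (1 = marker amplified in that hybrid).
toy = {
    "A": [1, 1, 0, 0, 1, 0, 1, 0],
    "B": [1, 1, 0, 0, 1, 0, 0, 0],
    "C": [1, 0, 0, 0, 1, 1, 0, 0],
    "D": [0, 0, 0, 1, 1, 1, 0, 0],
}
print(best_order(toy))  # (5, ('A', 'B', 'C', 'D'))
```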

SCH mapping 

The somatic cell hybrid panel DNAs (Yerle et al., 1996) provided by 
INRA (France) were used for the mapping of genes indicated in Table 1. The 
PCR conditions for SCH mapping were the same as for RH mapping, except 
that 20 ng of panel DNA was used for each PCR. The assignments were 
performed with software that calculates the likelihood of the localization of a 
marker in 115 individual regions of the porcine genome on the INRA web 
server (http://www.toulouse.inra.fr/lgc/pig/pcr/pcr.htm). 


Results and discussion 

Primer pairs for RH mapping 

First, using EST sequences registered in the database and generated in
our laboratory, primer pairs were designed for the sequences orthologous
to 70 human genes under the conditions described above. Forty-eight of
the 70 primer pairs (69%) were found to amplify fragments of the expected
sizes specific to the swine genome, all of which were examined to confirm
that the sequences were identical to the corresponding part of the
respective EST.

In addition, a primer pair for TGFB1 was designed from the swine genomic
sequence, and primer pairs for GYS1, PSG, NUCB1, and LIPE from human
genome sequences. They were found to amplify swine-specific fragments.
Consequently, 53 primer pairs were used for RH mapping.
Table 1. Description of genes mapped with IMpRH panel


No. Gene      Gene name                                      Accession  UniGene    Primer sequence (5'-3')    Ann.  PCR     2pt LOD RF    2pt nearest
    symbol                                                   No.        code                                  Temp. product score         marker/RF

1   ATP4A     ATP4 (ATPase, H+/K+ exchanging, alpha          NM_000703  Hs.36992   F: TGCAGATCAGCTCGTGGTAG    65    205 bp  6.04    0.35  SW855/0.41
              polypeptide)                                                         R: GCACATGGTGGAGAAGAAGG    65
2   UPK1A     uroplakin 1A                                   BG609475   Hs.159309  F: AACTTCACATCGGCCTTCC     59    139 bp  7.04    0.33  SW855/0.41
                                                                                   R: TGGTGAACAGGTAATCCAAGTG  59
3   MGC15677  hypothetical protein MGC15677                  AU296721   Hs.71941   F: GGGGGATGGTTCCTGAGC      65    123 bp  7.61    0.28  RYR/0.21
                                                                                   R: CCCCAGGCAGATACTGGTT     65
4   CAPNS1    calpain, small subunit 1                       BF189081   Hs.74451   F: GGATCAGTCCAGTCCCTTCA    59    120 bp  6.24    0.35  RYR/0.21
                                                                                   R: GGCTGAGGGCAGAATACAGA    59
5   FLJ14779  hypothetical protein FLJ14779                  BF198325   Hs.243662  F: ACAGGGAATGTGGGAAGACC    59    102 bp  9.17    0.31  RYR1/0.21
                                                                                   R: AAAAGCATTCCCATACATTTCA  59
6   SPINT2    serine protease inhibitor                      BF079061   Hs.31439   F: GCTGTTCGTGATGGTCCTG     65    139 bp  12.49   0.28  RYR1/0.21
                                                                                   R: AGTGCTCTTCACCAGGTGCT    65
7   PSMD8     proteasome (prosome, macropain) 26S            AU296726   Hs.78466   F: GGAGTCAGCCTATATGCACCA   57    145 bp  8.11    0.52  S0155/0.40
              subunit, non-ATPase, 8                                               R: AGACACCGGATGCTTGATGT    57
9   MAP4K1    mitogen-activated protein kinase               BI337048   Hs.86575   F: GGGAAGACCCCCTACTTGTATT  59    101 bp  14.33   0.25  RYR1/0.21
                                                                                   R: AGCCGGTGAGGACTAATGTG    59
10  SUPT5H    suppressor of Ty 5 homolog (S. cerevisiae)     BG384760   Hs.70186   F: CGTTGGCTATAGTCCGATGA    65    182 bp  13.84   0.25  SW193/0.20
                                                                                   R: TGACACTTCGGATGACACCT    65
11  DLL3      delta-like 3 (Drosophila)                      BI181274   Hs.127792  F: ACCCACAGCGCTTTCTTCT     59    194 bp  14.88   0.24  SW193/0.20
                                                                                   R: CTCCTGGGTCCTCATGTTGT    59
12  PSMC4     proteasome (prosome, macropain) 26S            AU296725   Hs.211594  F: TGCAGCAAGAGCTGGAGTT     59    171 bp  13.84   0.25  SW193/0.20
              subunit, ATPase, 4                                                   R: CCACGATGGCTGTATTCTGA    59
13  AKT2      v-akt murine thymoma viral oncogene            BG835263   Hs.326445  F: GACCATGAACGACTTCGACT    61    131 bp  16.06   0.23  SW193/0.20
              homolog 2                                                            R: TGGCGATGATAACCTCCTTC    61
14  PRX       periaxin                                       BG732511   Hs.205457  F: AGCAGGGTACAGGGTCCAG     61    115 bp  14.34   0.22  SW193/0.20
                                                                                   R: CACTGTCACAGCAGGCATCT    61
15  BLVRB     biliverdin reductase B                         BG835668   Hs.76289   F: GGTTATGAGGTGACGGTGCT    61    118 bp  13.77   0.18  SW193/0.20
                                                                                   R: CCACGGTCTTGTCCACATC     61
16  SNRPA     small nuclear ribonucleoprotein polypeptide    AU296730   Hs.173255  F: GGATATCATCGCCAAGATGAA   57    153 bp  11.8    0.2   SW133/0.21
              A                                                                    R: GGGACAGGACCCTGAACG      57
17  CYP2S1    cytochrome P450, subfamily                     BE014726   Hs.98370   F: CAACATCATCTGCTCCCTCA    61    107 bp  13.32   0.21  SW133/0.21
                                                                                   R: GGGAGCTGACCCCCACTAC     61
18  AXL       AXL receptor tyrosine kinase                   AU296714   Hs.83341   F: GGAGCCTCCTGAGGTGACC     61    100 bp  13.32   0.21  SW133/0.21
                                                                                   R: GTCATCCTGCCCATCTTCG     61
19  TGFB1     transforming growth factor, beta 1             -          -          F: TTGCTACTTCCTTCCACCTC    55    659 bp  11.15   0.27  SW133/0.21
                                                                                   R: ATCCTCTGATGCCACAACCT    55
20  BCKDHA    branched chain keto acid dehydrogenase E1,     AU296716   Hs.78950   F: TGCTCACGTACCGGGACTAC    59    144 bp  12.4    0.22  SW133/0.21
              alpha polypeptide (maple syrup urine disease)                        R: TGGCCAGTGGAGAGGAGATA    59
21  PSG       pregnancy specific beta-glycoprotein           U09815     -          F: GGCTACAGCTGGTACAGAGGA   55    105 bp  14.23   0.25  SW133/0.21
                                                                                   R: TCGACCACTGAATGTAGGTCC   55
22  RABAC1    Rab acceptor 1 (prenylated)                    AU296727   Hs.11417   F: GCCGAAACTGATTCCCTCT     61    187 bp  15.4    0.24  SW133/0.21
                                                                                   R: GCCCAGGAACACAAACACAT    61
23  ERF       Ets2 repressor factor                          AU296427   Hs.333069  F: CCTACAAGCCCGAGTCATCC    61    206 bp  9.4     0.48  SW286/0.32
                                                                                   R: CCGGCTCAGCTTGTCATAAT    61
24  CIC       capicua homolog (Drosophila)                   AU296717   Hs.306117  F: AAATTCCCTAGCTCGTCTG     57    210 bp  13.32   0.21  SW133/0.21
                                                                                   R: CCGCACCTTCACCTTCTTG     57
25  EGFL4     EGF-like-domain, multiple 4                    BF441290   Hs.158200  F: CGGCAGTTGCTGGTCATACT    61    142 bp  13.32   0.21  SW133/0.21
                                                                                   R: CTCTGGGCTCCAGACACC      61
26  LIPE      lipase, hormone-sensitive                      NT011139   Hs.95351   F: TTTGAGATGCCACTGACTGC    57    362 bp  12.6    0.2   SW133/0.21
                                                                                   R: CAGGCTGCTGAGCTCCTC      57
27  CEACAM1   carcinoembryonic antigen-related cell          BE236127   Hs.50964   F: CACAATTTGGGTATTGGGTTTT  61    131 bp  16.6    0.24  SW133/0.21
              adhesion molecule (biliary glycoprotein)                             R: TCAGATTGGCTGACACAGGT    61
28  XRCC1     X-ray repair complementing defective repair    BE012688   Hs.98493   F: GGTCCTTCTGGTCACCTCATC   65    151 bp  12.04   0.26  SW133/0.21
              in Chinese hamster cells 1                                           R: CGGCTGACTGCAGACAATTT    65

29 

PLAUR 

plasminogen activator, urokinase receptor 

BI 119095 

Hs. 179657 

F: ATCTTCAAAAGCTGCCTCCA 

61 

132bp 

12.04 

0.26 

SW 133/0.21 






R: CTGTGGCTTCCAGACATTGA 

61 





30 

R309531 

hypothetical protein R30953 1 

BI346828 

Hs. 180943 

F: GGAGGAGGAGACCACCATC 

59 

185bp 

8.74 

0.15 

SW133/0.21 






R: CGTTGATGAGCGACGACTT 

59 





31 

KCNN4 

potassium intermcdiate/small conductance 

AY062036 3 

Hs. 10082 

F: CCTGCTCAACGTCTCCTACC 

65 

139bp 

12.4 

0.22 

SW 133/0.21 



calcium-activated channel, subfamily N, 



R: AGTGGTGAGCCAGAGACCAA 

65 







member 4 









32 

ZNF155 

zinc finger protein 155 

BI336171 

Hs.31324 

F: TT C A AGG AT GT GGCT GT GG 

65 

11 lbp 

10.4 

0.24 

SSC8E02/0.18 






R: GG AG ACC AGGTTCCTG A AGTT 

65 





33 

ZNF226 

zinc finger protein 226 

BE234711 

Hs. 145956 

F: A ACCGT AT A A AT GT GGGG AGT G 

59 

164bp 

10.79 

0.25 

SW133/0.21 






R: TCCTGTGTGAACTCGTCTGTG 

59 





34 

BCL3 

B-cell CH/lymphomas 

BI341985 

Hs.31210 

F: GTCTTCTGTCGGCATCACC 

59 

126bp 

13.26 

0.17 

SSC8E02/0.18 






R: CAAAGGGCAGGAAGGTAGGT 

59 





35 

LU 

Lutheran blood group (Anberger B antigen 

BG383073 

Hs. 155048 

F: AATAGCTCCTGGACCCAAGC 

59 

125bp 

11.68 

0.16 

SSC8E02/0.18 



included) 



R: ACTCTGCAGCAGGACGTCAG 

59 





Table 1 (continued)

No. | Gene symbol | Gene name | Accession No. | UniGene code | Primer sequence (5'-3') | Ann. Temp. | PCR product | 2pt LOD score | RF | 2pt nearest marker/RF
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
36 | APOE | apolipoprotein E | AU296713 | Hs.169401 | F: GAGCATGAAGGAGGTGAAGG; R: GTCTGGCCCAACATGTTG | 59; 59 | 189bp | 13.64 | 0.15 | SSC8E02/0.18
37 | MARK4 | MAP/microtubule affinity-regulating kinase 4 (formerly MARKL1) | BE013117 | Hs.118843 | F: GAGCCTGAAGACTGGAGCAC; R: AACAGGCTGCAGGTGAGG | 59; 57 | 255bp | 15 | 0.18 | SSC8E02/0.18
38 | ERCC1 | excision repair cross-complementing rodent repair deficiency, complementation group 1 | AU296719 | Hs.59544 | F: CCGCAGACCTATGCTGAGTA; R: GGCTCACAATGATGCTGTTG | 61; 61 | 145bp | 12.49 | 0.24 | SSC8E02/0.18
39 | RTN2 | reticulon 2 | AU296728 | Hs.3803 | F: AGAGACCACATCGCAGGACT; R: ACCGTAGGGGCAGAATCC | 57; 57 | 252bp | 14.43 | 0.22 | SSC8E02/0.18
40 | VASP | vasodilator-stimulated phosphoprotein | BI335996 | Hs.93183 | F: CTTTGTCCAGGAGCTGAGGA; R: TTCCAAGTGAGATGGGGAGT | 59; 59 | 178bp | 12.49 | 0.24 | SSC8E02/0.18
41 | SNRPD2 | small nuclear ribonucleoprotein D2 polypeptide (16.5kD) | AU296729 | Hs.53125 | F: GCTGGAGAACGTGAAGGAGA; R: AGGACCACGATGACCGAGT | 65; 65 | 132bp | 13.41 | 0.23 | SSC8E02/0.18
42 | DMPK | dystrophia myotonica-protein kinase | AU296718 | Hs.898 | F: AGGAGATCCTGGGCTCCAC; R: TCCTCTGATCCGGGCACT | 61; 61 | 100bp | 16.97 | 0.2 | SSC8E02/0.18
43 | PPP5C | protein phosphatase 5, catalytic subunit | AU296723 | Hs.75180 | F: ACAAGGACGCCAAGATGAAG; R: CTCGATGTCGAGTGAGTCCA | 59; 59 | 119bp | 9.89 | 0.22 | SSC8E02/0.18
44 | SLC1A5 | solute carrier family 1 (neutral amino acid transporter), member 5 | BF444088 | Hs.183556 | F: CCAGCAAGATTGTGGAGATG; R: ATGAGGTTCCAAAGGCAGTG | 65; 65 | 198bp | 9.22 | 0.23 | SSC8E02/0.18
45 | NAPA | N-ethylmaleimide-sensitive factor attachment protein, alpha | AU296722 | Hs.75932 | F: GGCGGTTTGCTGAGTACCTT; R: CAAAGAGGCCCGAGAAAAAG | 57; 55 | 127bp | 10.4 | 0.21 | S0220/0.22
46 | KDELR1 | KDEL (Lys-Asp-Glu-Leu) endoplasmic reticulum protein retention receptor 1 | AU296720 | Hs.78040 | F: ATTGCCTGCTCTTTCACCAC; R: TTGACCAAGAATGCCAGGAT | 55; 53 | 128bp | 12.59 | 0.26 | S0220/0.22
47 | PSCD2L | N-ethylmaleimide-sensitive factor attachment protein, alpha | AU296724 | Hs.8517 | F: CAGGCAATTTCTCTGGAGCTT; R: GTTGTGGAGGCTGGTGTTC | 55; 57 | 172bp | 12.59 | 0.26 | S0220/0.22
48 | PLEKHA4 | pleckstrin homology domain-containing family A (phosphoinositide binding specific) member 4 | BI338654 | Hs.9469 | F: TAGAAGCAGCCCTGGAATTG; R: TGGGTCAGATGACAAATAGTGG | 59; 59 | 142bp | 13.47 | 0.24 | S0220/0.22
49 | PPP1R15A | protein phosphatase 1, regulatory (inhibitor) subunit 15A | BE232281 | Hs.76556 | F: CTGAGAAGGTCTCCGTCCAT; R: AGCAGGGGTGAGGTAGGG | 65; 65 | 155bp | 11.41 | 0.16 | S0220/0.22
50 | NUCB1 | nucleobindin 1 | U31341 | - | F: AAAGAATGAGGAGGACGACA; R: TCTGAGTGGATGCGAGGAAC | 55; 55 | 402bp | 16.7 | 0.2 | S0220/0.22
51 | BAX | BCL2-associated X protein | AU296715 | Hs.159426 | F: TTCATCCAGGATCGAGCAG; R: GCAGCTCCATGTTACTGTCC | 55; 57 | 142bp | 17.22 | 0.24 | S0220/0.22
52 | GYS1 | glycogen synthase 1 (muscle) | Z33633 | - | F: GGCCTTTCCAGAGCACTTCA; R: CCTCCTCGTCCTCATCGTAG | 55; 55 | 360bp | 14.04 | 0.15 | S0220/0.22


Assignment of genes to IMpRH map 

The IMpRH panel was typed for 53 genes. The primer pairs for two
genes generated fragments in all panel DNAs, so only the results for
the remaining 51 genes were scored for RH mapping. RH mapping
revealed that 49 of the 51 genes were linked to first-generation
markers of SSC6 (Hawken et al., 1999) with LOD scores greater than 6
by two-point analysis. As exemplified by the mapping results detailed
in Table 1, the 49 genes were clustered in the region between SW855
and S0220, which corresponds to SSC6q11→q21. This observation
indicates that almost all of the genes examined are syntenically
conserved between HSA19 and SSC6, consistent with the comparative
results obtained by bi-directional chromosome painting (Goureau et
al., 1996). The remaining two genes, PSMD8 and ERF, were mapped to
SSC1 and SSC4, respectively (LOD scores 8.10 and 9.12). Because the
localization of these two genes deviated from the conserved synteny
between HSA19 and SSC6 and from the bi-directional chromosome
painting results, SCH mapping was additionally performed for the two
genes; it supported the RH mapping results (maximal correlation 0.67
and 0.77).

In our previous study using human genomic STSs (Kiuchi et al., 2002)
and an earlier study using human ESTs (Lahbib-Mansais et al., 1999),
the success rates of RH mapping using primer pairs designed from
human sequences were shown to be 1.4 and 1.9%, respectively. The
success rate of RH mapping in the present study, based on swine
sequences including ESTs, was calculated to be 67%; this value is in
accordance with the value reported by Rink et al. (2002) using swine
ESTs. Consequently, it is preferable to obtain swine sequences first
and then perform PCR-based mapping such as RH mapping.

Comparative map between swine and human chromosomes 

Forty-nine of the 51 genes in the HSA19q13 region were shown to be
conserved in the SSC6q11→q21 region. To examine whether the
arrangement of genes along the SSC6q11→q21 region is conserved in
comparison with the HSA19q13 region, the 49 genes, together with the
first-generation markers in the swine chromosomal region, were
ordered using the Carthagene software for RH mapping (Schiex et al.,
2001). The most likely order of genes, including framework markers of
the IMpRH map, was calculated (Fig. 1A). The length between Sw492 and
Sw1129 was found to be 747.1 centiRay (cR).

The arrangement of the genes along SSC6 thus indicated as the most
likely order was compared with that of HSA19
(http://www.ncbi.nlm.nih.gov/mapview/, build 30), and the comparative
map of the genes between swine and human is presented schematically
in Fig. 1A. To facilitate the discussion of the arrangement of genes
on SSC6 in comparison with human, the chromosomal region was
operationally divided into five blocks, as shown in Fig. 1A.

Fig. 1. Comparative gene map between SSC6q11→q21 and
HSA19q13.1→q13.3. (A) IMpRH mapping results are presented
schematically in comparison with the gene arrangement in the
HSA19q-arm. Assuming that 1 cR in the IMpRH map of SSC6q11→q21
corresponds to 23 kb, the gene order and the distances between genes
were drawn at the same magnification as the human gene map shown on
the right-hand side. The horizontal bars show the correspondence
between swine genes and their human counterparts. The genes mapped
are operationally grouped into five blocks to facilitate the
discussion of gene arrangement (see text). (B) To express graphically
the relationship of gene order in the above swine and human
chromosome regions, genes were numbered in order from centromere
toward telomere of the HSA19q-arm, as shown in Table 1, and arranged
along the abscissa by gene number. Likewise, genes were numbered in
order from centromere toward telomere of SSC6q and arranged along the
ordinate. The intersection of the abscissa and ordinate values for
each gene is marked by a closed square. The two genes ERF and PSMD8,
marked by zero on the ordinate, represent translocation to chromosome
regions other than the SSC6q-arm.
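
The construction described for panel B can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dot_plot_points` and the toy gene labels G1 to G5 are hypothetical and are not the genes or orders reported in Table 1.

```python
# Hypothetical helper: builds the (human rank, swine rank) pairs that would be
# plotted as closed squares in a gene-order dot plot. Genes that are absent
# from the swine order (mapped to another chromosome) get ordinate 0.
def dot_plot_points(human_order, swine_order):
    swine_rank = {gene: i for i, gene in enumerate(swine_order, start=1)}
    return [(i, swine_rank.get(gene, 0), gene)
            for i, gene in enumerate(human_order, start=1)]


# Toy orders for illustration only (centromere toward telomere).
human = ["G1", "G2", "G3", "G4", "G5"]
swine = ["G1", "G3", "G2", "G5"]  # G4 is assumed to map elsewhere

points = dot_plot_points(human, swine)
for x, y, gene in points:
    print(gene, x, y)
```

In such a plot a run of points with decreasing ordinate (here G2 and G3) corresponds to a local inversion, and a point at ordinate 0 to a gene mapped outside the region.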


Block 1. Generally, a conserved order of loci was found within the
pig and human chromosome segments containing ATP4, UPK1, CAPNS1,
MGC15677, FLJ14779, SPINT2, RYR1, and MAP4K1, with the two inversions
MGC15677/CAPNS1 and RYR1/MAP4K1. The study of Martins-Wess et al.
(2002) on the 1.2-Mb BAC contig of the RYR1 region demonstrated that
the contig contained PSMD8, in contrast to the present observation
that PSMD8 was consistently assigned to SSC1 by both the RH and SCH
approaches. To clarify this point, additional experiments such as
FISH and sequencing of the BAC contig are required.

Block 2. The order of SUPT5H, PSMC4, AKT2, and SNRPA was the same as
in human, whereas the order of BLVRB, PRX, and DLL3 was inverted
relative to the human order, with these genes inserted into the
arrangement of SUPT5H, PSMC4, AKT2, and SNRPA.

Block 3. CYP2S1, AXL, CIC, EGFL4, and LIPE were placed at one
position of the IMpRH map, as were XRCC1 and PLAUR. The resolution of
the IMpRH panel did not allow the positional relationships of the
genes within these groups to be determined, suggesting that the genes
reside in close proximity. This could be examined further by mapping
with a higher-resolution RH panel, the 12,000-rad IMNpRH2 (Yerle et
al., 2002), and/or by construction of a BAC contig for the segment.
The spacing between CYP2S1 and AXL, and between the CIC, EGFL4 and
LIPE genes, conformed to that of the human chromosome segment
(http://www.ncbi.nlm.nih.gov/mapview/maps.cgi?org=hum&chr=19). It
should be pointed out that the segment consisting of three genes
(CIC, EGFL4, and LIPE) appears to be translocated in pigs. The
segment containing TGFB1, PSG, and BCKDHA was found to be inversely
oriented in the swine genome in comparison with the human genome.


Block 4. The two genes ZNF155 and ZNF226 in Block 4A were inversely
oriented between the swine and human chromosomes. The order of BCL3,
LU, APOE, and MARKL1 in Block 4B was conserved, with an inverted
orientation between pig and human. The segment containing ERCC1,
RTN2, VASP, SNRPD2, and DMPK, shown as Block 4C, appeared to have
undergone intrachromosomal translocation. Within Block 4C, all the
genes except RTN2 were arranged in the same order as in human.

Block 5. As shown in Fig. 1A and 1B, the locus order in the porcine
chromosome segment including the genes PPP5C, SLC1A5, NAPA, KDELR1,
PLEKHA4, PPP1R15A, NUCB1, BAX, and GYS1 differed from that in human.

To provide information on the possible insertion or deletion of
chromosomal fragments when evaluating the conservation of gene
arrangements, the length of the swine chromosome region should be
expressed on a common and definite scale such as base pairs (bp). The
correspondence between cR and bp in the IMpRH map has been reported
in several studies, as follows. In the pioneering study of Hawken et
al. (1999), 1 cR of the IMpRH7000 map was estimated to be 70 kb on
average, ranging from 50.4 kb for SSC3 to 116 kb for SSC18.
Subsequently, Genet et al. (2001) determined the mean physical
distance per cR for the IMpRH map as 14.6 kb. In the study of Hiraiwa
et al. (2001), the physical distance per cR varied from 5.2 kb in the
TCRA/TCRD region to 0.3 kb in the TCRB region. Recently, Kiuchi et
al. (2002) reported variation in the distance per cR using BAC ends,
which ranged from 8 kb/cR (SSC12) to more than 126 kb/cR (SSC18).
Taking these findings into account, it is likely that the physical
distance per cR depends on the genome region analyzed. Therefore, the
kb/cR value reported for the nearest region, 23 kb/cR for clone 387B4
linked to Sw709 (SSC6) (Kiuchi et al., 2002), was used to estimate
the distance between Sw492 and Sw1129; the estimate showed
that the distance comprised 17.2 Mb.
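
The conversion used above is a single multiplication, shown here as a minimal Python sketch. The function name `cr_to_mb` is an assumption introduced for illustration; the inputs (747.1 cR and 23 kb/cR) are the values quoted in the text.

```python
def cr_to_mb(distance_cr, kb_per_cr):
    """Convert a radiation hybrid map distance (cR) to megabases."""
    return distance_cr * kb_per_cr / 1000.0


# Sw492 to Sw1129: 747.1 cR at the regional estimate of 23 kb/cR.
span_mb = cr_to_mb(747.1, 23)
print(round(span_mb, 1))
```

Because the kb/cR ratio varies between genome regions, the result is only as reliable as the regional ratio chosen.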

Distances between genes on SSC6 are shown in Fig. 1A with those on
HSA19 as reference. For example, the distances between ATP4A and
RYR1, and between RYR1 and GYS1, in the swine chromosome were
calculated to be 2.5 and 9.6 Mb, respectively, whereas the
corresponding distances are 2.5 and 10 Mb in human. Based on this
calculation, the lengths of Blocks 1 through 4 appeared to be
approximately the same in swine and human. The calculation, however,
indicated a difference between swine and human in the length of
Block 5, which contains rearrangements between homologous chromosomal
segments. If the calculation is correct, insertion (or deletion) of
chromosomal segments would have occurred in the course of
rearrangements during the evolution of the species.

Finally, with respect to the selection of candidate genes responsible
for the maldevelopment of embryos in the NIAI family (Yasue et al.,
1999), the genes (PSG, CEACAM1, TGFB1, PSMC4, CYP2S1, and VASP) whose
functions directly or indirectly influence the development of embryos
(Bonyadi et al., 1997; Wu et al., 1999; Sakao et al., 2000; Han et
al., 2001) are of special interest. For example, VASP is expressed in
a cell-specific and trimester-specific manner in the human placenta,
and is expected to be important for blastocyst implantation and
placental development (Kayisli et al., 2002). In a forthcoming study,
we will examine the expression and sequence variation of these genes
in the affected embryo, in order to identify the gene responsible for
the maldevelopment.

Abstract. We generated a sequence-ready BAC/PAC contig spanning
approximately 5.5 Mb on porcine chromosome 6q1.2, which represents a
very gene-rich genome region. STS content mapping was used as the
main strategy for the assembly of the contig, and a total of 6
microsatellite markers, 53 gene-related STS and 116 STS corresponding
to BAC and PAC end sequences were analyzed. The contig comprises 316
BAC and PAC clones covering the region between the genes GPI and
LIPE. The correct contig assembly was verified by RH-mapping of STS
markers and comparative mapping of BAC/PAC end sequences using BLAST
searches. The use of microsatellite primer pairs allowed the
integration of the physical maps with the genetic map of this region.
Comparative mapping of the porcine BAC/PAC contig with respect to the
gene-rich region on the human chromosome 19q13.1 map revealed a
completely conserved gene order of this segment; however, physical
distances differ somewhat between HSA19q13.1 and SSC6q1.2. Three
major differences in DNA content between human and pig are found in
two large intergenic regions and in one region of a clustered gene
family, respectively. While there is complete conservation of gene
order between pig and human, the comparative analysis with respect to
the rodent species mouse and rat shows one breakpoint where a genome
segment is inverted.

Copyright©2003 S. Karger AG, Basel 


The pig has recently joined the organisms that are considered high
scientific priority for genome sequencing by the NHGRI
(http://www.genome.gov/10002154; Rohrer et al., 2002). For the
correct assembly of whole genome shotgun sequences, highly accurate
physical maps will be indispensable. High-resolution physical maps of
specific genome regions are also required for the localization,
isolation and characterization of genes, including those involved in
the development of specific diseases or QTLs. Since the human genome
project is nearly finished, comparative mapping approaches using the
human information greatly facilitate the construction of physical
maps in other mammalian species. In the pig, the first comparative
maps were determined at low resolution by chromosome painting
experiments that demonstrated conservation of large chromosome
fragments between human and pig (Goureau et al., 1996). Increased
resolution of the comparative maps was achieved with the use of
defined anchor probes in FISH experiments (Pinton et al., 2000).
Recently, medium-resolution global comparative maps were provided by
RH-mapping (Rink et al., 2002). Detailed and high-resolution
comparative maps of specific pig genome regions require clone-based
approaches and have rarely been reported in the past, e.g. the
comparative analysis of the porcine RN gene region on SSC15q25 with
its syntenic counterpart on HSA2q35 (Jeon et al., 2001; Robic et al.,
2001). Another very recent comparative analysis of several genome
regions in multiple mammalian species also included the pig and
described methods for increasing the throughput of the time-consuming
clone-based mapping strategies (Thomas et al., 2002).

We chose the gene-rich region on porcine chromosome 6q1.2 to generate
high-resolution clone-based physical maps that are integrated with RH
and genetic maps, as this region contains at least one gene of large
economic importance to swine production (RYR1) and possibly other
QTLs related to muscle growth (Fujii et al., 1991; Bidanel et al.,
2001). We have previously reported the initial construction and
analysis of a 1.2-Mb BAC/PAC contig in this region (Martins-Wess et
al., 2002) and the use of different resolution RH-panels in the
analysis and quality control of this contig (Martins-Wess et al.,
2003).

In the present study, we report the extension of this BAC/PAC contig
to 5.5 Mb and its analysis and integration with other existing maps.
The resulting high-resolution integrated map should facilitate the
assembly of large-scale sequencing data of this region and help to
define positional candidate genes for QTLs in this region.


Materials and methods 

DNA library screening and chromosome walking 

Library screenings were done as described (Martins-Wess et al., 2002).
Briefly, the TAIGP714 PAC library (Al-Bayati et al., 1999;
http://www.rzpd.de/) was screened by PCR of hierarchical DNA pools. The
porcine genomic BAC library RPCI-44 (Fahrenkrug et al., 2001) was screened
by radioactive hybridization according to the RPCI protocol
(http://www.chori.org/bacpac/).

BAC and PAC DNA was prepared from 100-ml overnight cultures using the
Qiagen Midi plasmid kit according to the modified protocol for BACs
(Qiagen, Hilden, Germany). Insert sizes were determined as described
(Martins-Wess and Leeb, 2003). A detailed list of the primer sequences
used for library screenings can be found in the supplementary information.

DNA sequence analysis 

BAC and PAC end sequences were determined using T7 and SP6 primers and
the thermosequenase kit (Amersham Biosciences, Freiburg, Germany) on a
LICOR 4200L automated sequencer. Sequence data were analyzed with
Sequencher 4.0.5 (GeneCodes, Ann Arbor, MI). The sequence analysis of one
complete PAC clone of this contig has been described elsewhere
(Drögemüller and Leeb, 2002). Further analyses were performed with the
online tools of the European Bioinformatics Institute
(http://www.ebi.ac.uk/), BLAST database searches against the human genome
and nr sections of the GenBank database of the National Center for
Biotechnology Information NCBI (http://www.ncbi.nlm.nih.gov/) and the
RepeatMasker searching tool for repetitive elements (Smit, A.F.A. and
Green, P., http://repeatmasker.genome.washington.edu/). Single copy
sequences were used to design primer pairs for the chromosome walking
using the programs GeneFisher and Primer3
(http://bibiserv.techfak.uni-bielefeld.de/cgi-bin/gf_submit?mode=START,
http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi).

Somatic cell hybrid and RH mapping 

STS markers were tested on a somatic cell hybrid panel (Yerle et al.,
1996) or mapped on the IMpRH (Yerle et al., 1998) and IMNpRH2 panels
(Yerle et al., 2003) according to the INRA protocols
(http://www.toulouse.inra.fr/lgc/lgc.htm). After genotyping the IMpRH
panel, the chromosome assignments were performed with software available
on the INRA WWW server and submitted to the IMpRH database
(http://imprh.toulouse.inra.fr; Milan et al., 2000).

Genetic mapping of microsatellite markers 

Microsatellite markers were genotyped across seven families of the USDA
Meat Animal Research Center Swine Reference Population as described
(Rohrer et al., 1994). Initial localization of each marker was based on
TWOPOINT analyses (CRIMAP 2.4) against all markers previously mapped in
the population. Specific locations within the linkage group were
determined with the ALL option. FLIPS was then run to determine whether
the current order was the most likely, and the results from CHROMPIC were
used to identify suspect genotypes. All suspect genotypes were evaluated
and corrections were made.

Results and discussion 

Construction of the BAC and PAC Contig 

The construction of the BAC/PAC contig was initiated by screening the
TAIGP714 PAC library or the RPCI-44 BAC library with probes derived
from human HSA19q13.1 genes. Further probes that allowed the gradual
joining of the individual emerging contigs into one large contig were
generated from the end sequences of isolated clones. In the process
of the contig construction, 175 region-specific STS markers of
SSC6q1.2 were generated, including 53 gene-associated STS markers
(see supplementary table, www.karger.com/doi/10.1159/000075735). All
STS markers were tested either on a somatic cell hybrid panel or on
the 7,000-rad IMpRH panel to confirm the correct chromosomal
localization on SSC6. In addition, the 12,000-rad IMNpRH2 panel was
used to verify the order and orientation of subcontigs prior to major
assembly steps. The complete BAC/PAC contig consisted of 316 clones
(223 BACs and 93 PACs, see supplementary figure,
www.karger.com/doi/10.1159/000075735). The entire contig spans
approximately 5.5 Mb and can be covered with a minimal tiling
path of 45 clones.
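
A minimal tiling path of the kind mentioned above can be found with a greedy interval-cover pass once clone positions are known. The sketch below is an illustration under stated assumptions, not the procedure used in this study: the function name `minimal_tiling_path`, the clone names and the kb coordinates are all invented, and a real contig would also require a minimum overlap between neighbouring clones.

```python
def minimal_tiling_path(clones, start, end):
    """Greedy interval cover.

    clones: list of (name, left_kb, right_kb); returns the names of a
    smallest set of clones that covers [start, end] without gaps.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path, covered, i = [], start, 0
    while covered < end:
        best = None
        # Among clones that begin inside the covered stretch,
        # keep the one that extends furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in contig at {covered} kb")
        path.append(best[0])
        covered = best[2]
    return path


# Hypothetical clone coordinates in kb, for illustration only.
toy_clones = [("A", 0, 150), ("B", 100, 260), ("C", 120, 300),
              ("D", 280, 420), ("E", 290, 400), ("F", 410, 500)]
tiling = minimal_tiling_path(toy_clones, 0, 500)
print(tiling)
```

The greedy choice is optimal for this covering problem: picking the clone that reaches furthest can never require more clones than any other choice starting in the same covered stretch.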

Physical mapping and gene order 

537 end sequences from the BAC/PAC clones of the contig were
generated and submitted to the EMBL database under accessions
AJ514457-AJ514832, AJ560805-AJ561041 and AJ561044-AJ561089. All these
BAC and PAC end sequences were subjected to BLAST searches against
the human genome sequences. Of the 537 sequences, 52 (9.7%) had
significant (e-value < 10⁻⁵) and unique matches against genomic
sequences of HSA19q13.1. These matches corresponded well with the
overall clone order in the porcine BAC/PAC contig and confirmed the
correct assembly. The physical mapping information derived from the
contig assembly was refined by taking into account BAC/PAC insert
sizes and clone overlap sizes, which were determined by EcoRI
fingerprinting. The clone-based physical map was anchored to the RH
map of the pig genome by analyzing STS markers from the contig on the
IMpRH panel. In total, 20 evenly spaced RH results were submitted to
the IMpRH porcine RH
mapping database (http://imprh.toulouse.inra.fr/).
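
The selection of significant and unique matches described above can be sketched as a simple filter. This is a hedged illustration only: the function name `unique_significant_hits`, the tuple layout and the toy hits are assumptions and do not reproduce the BLAST output format or the data of this study; only the threshold (e-value < 10⁻⁵) and the counts 52 and 537 come from the text.

```python
from collections import Counter


def unique_significant_hits(hits, max_evalue=1e-5):
    """Keep hits for queries with exactly one match below the e-value cutoff.

    hits: list of (query_id, subject_region, evalue) tuples.
    """
    significant = [h for h in hits if h[2] < max_evalue]
    per_query = Counter(query for query, _, _ in significant)
    return [h for h in significant if per_query[h[0]] == 1]


# Toy hits for illustration only.
toy_hits = [
    ("end1", "HSA19q13.1", 1e-20),
    ("end2", "HSA19q13.1", 1e-3),   # not significant
    ("end3", "HSA19q13.1", 1e-30),
    ("end3", "HSA7", 1e-12),        # second significant hit: not unique
    ("end4", "HSA19q13.1", 1e-8),
]
kept = unique_significant_hits(toy_hits)
print([query for query, _, _ in kept])

# Fraction reported in the text: 52 of 537 end sequences.
percent = round(100 * 52 / 537, 1)
print(percent)
```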

During construction of the porcine contig, three techniques were used
to localize porcine genes on the contig: (1) Heterologous primers
were designed from the human gene sequences in order to localize the
porcine orthologs by STS content mapping on the contig. In cases
where orthologous porcine EST information was available for
HSA19q13.1 genes, the genomic structure of the porcine gene was
inferred from the human ortholog and primers for genomic PCR were
designed from the porcine EST sequence. (2) Alternatively, some genes
were localized by hybridization of human cDNA probes to membranes
with ordered sets of BAC/PAC DNAs. (3) Finally, in some cases the
BLAST searches with BAC/PAC end sequences revealed the presence of
exons within these end sequences. Thus the corresponding genes could
be localized on the contig in silico. Using these three approaches,
58 genes could be assigned to the BAC/PAC contig.

[Fig. 1 key. Experimental evidence in SSC6 physical map: finished DNA
sequence of complete BAC/PAC clone; STS marker; BLAST result; BLAST
result and STS marker; BLAST result, STS marker and cDNA
hybridization result; cDNA hybridization result; microsatellite.]

Integration of the physical map with the genetic map 

STS content analysis of all clones of the BAC/PAC contig 
revealed that two previously described porcine microsatellites 
are located on the contig (HAL, SW193). The precise physical 
assignment of these microsatellites will benefit future QTL 
analyses with these markers, as their position with respect to 
coding genes is now available. 

Additionally, several novel microsatellites were found during this
study in the BAC/PAC end sequences. Flanking PCR primers for four of
these microsatellites (L105_MS, NEUD4_MS, PAK4_MS, CD79A_MS) were
designed and the microsatellites were genotyped on seven litters (86
progeny) of the MARC swine reference population. The markers were
included in the linkage map of SSC6 (Fig. 1). The genetic mapping of
all six markers is consistent with the physical mapping. The
recombination frequency cannot be reliably estimated in the
investigated region, as only two recombinations were observed between
the two bins of markers, with an average of 86 informative meioses
for the six physically mapped microsatellites. A more reliable
recombination frequency can be estimated from the interval between
the markers SW1067 and SW193, which are separated by 4.6 cM on the
genetic map and by 255 cR7000 on the latest RH framework map
(unpublished data). Considering a ratio of 20 kb/cR7000 (Martins-Wess
et al., 2003), this would mean that SW1067 and SW193 are separated by
roughly 5.1 Mb and that the recombination frequency would be 0.9
cM/Mb. This intermediate value for the recombination frequency seems
reasonable considering, on the one hand, that the investigated region
is located close to the centromere, where low recombination
frequencies are to be expected, and, on the other hand, that the
region is GC- and gene-rich, where
elevated recombination frequencies are expected.
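
The estimate for the SW1067 to SW193 interval can be reproduced with two lines of arithmetic. The Python sketch below uses the values quoted in the text (4.6 cM, 255 cR7000, 20 kb/cR7000); the function name `recombination_rate` is an assumption introduced for illustration.

```python
def recombination_rate(genetic_cm, rh_cr, kb_per_cr):
    """Return (physical distance in Mb, recombination rate in cM/Mb)."""
    physical_mb = rh_cr * kb_per_cr / 1000.0
    return physical_mb, genetic_cm / physical_mb


physical_mb, cm_per_mb = recombination_rate(4.6, 255, 20)
print(round(physical_mb, 1), round(cm_per_mb, 1))
```

The result depends directly on the assumed kb/cR ratio, so it should be read as a rough estimate rather than a measured value.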


Fig. 1. Comparative mapping of 58 gene-associated markers that were
used in this study in Sus scrofa (results from this study), Homo
sapiens (NCBI build 33), Mus musculus (NCBI build 30) and Rattus
norvegicus (NCBI 15-April-2003). The experimental evidence that led
to the assignment of the porcine genes is indicated by the color of
the marker. The gene symbol NYD-SP11 is an interim symbol used by
NCBI as no official symbol has been approved yet for this gene. For
one porcine STS marker in the region of the CYP2B gene cluster the
exact porcine orthologous gene name could not be determined;
therefore this marker was termed "CYP2B??". Note that there is
complete conservation of the gene order across the entire 5.5-Mb
interval between pig and human. The two rodent species show an
inversion of the genes Arhgef1 - Lipe (red line) with respect to the
segment Gpi - Bckdha. The bold gene names in the rat map denote
computer predicted genes, which are similar either to a murine or to
a human gene. The porcine physical mapping data about the gene order
on SSC6q1.2 were also integrated with the genetic map of this region.
Physically assigned microsatellites are shown in orange next to the
gene map. Detailed information on the microsatellite markers can be
found at http://www.marc.usda.gov/. Superscripts denote
gene-associated markers that have been used for RH mapping on (1) the
IMpRH, (2) the IMNpRH2, or (3) both the IMpRH and the IMNpRH2 panels.


Comparative analysis 

The gene order (Fig. 1) of the 58 localized genes on
SSC6q1.2 corresponds exactly to the gene order of the NCBI
HSA19 map (http://www.ncbi.nlm.nih.gov; build 33), which
lists 285 gene loci in the interval between GPI and LIPE. Of
these 285 gene loci, 136 represent computer-predicted hypothetical
genes while 149 have at least some experimental evidence.
Interestingly, the physical size of the investigated region is
much smaller in pig than in human (5.5 vs. 8.1 Mb). Three
major differences between pig and human account for a large
fraction of this deviation in DNA content. Two of these three
sites are located in regions where the human genome contains
extended non-coding sequence. In the human genome the
distance between GPI and USF2 is about 600 kb larger than in
pig, and the distance in the gene-poor region between COX7A1
and NYD-SP11 is roughly 300 kb larger in human than in pig.
Another significant difference of about 400 kb is found in the
region flanked by the BCKDHA and ARHGEF1 genes. In this
region the human genome harbors a clustered family of six
carcinoembryonic antigen-related cell adhesion molecule genes
(CEACAM3-CEACAM7, CEACAMP3). The reduced size of
the pig genome could be an indication that this gene family is
smaller in pig than in human.
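
As a rough arithmetic check of the statement above, the three quoted regional differences can be summed and compared with the overall size difference between the two species. The figures are the approximate values given in the text; the variable names are illustrative only.

```python
# Approximate sizes quoted in the text; names are illustrative.
human_mb = 8.1
pig_mb = 5.5

# Extra sequence in human relative to pig at the three sites (kb).
site_differences_kb = {
    "GPI-USF2": 600,
    "COX7A1-NYD-SP11": 300,
    "BCKDHA-ARHGEF1": 400,
}

# Overall difference in kb (rounded to avoid floating-point noise).
total_difference_kb = round((human_mb - pig_mb) * 1000)
explained_kb = sum(site_differences_kb.values())
fraction_explained = explained_kb / total_difference_kb

print(f"{explained_kb} of {total_difference_kb} kb "
      f"({fraction_explained:.0%}) lies at the three sites")
```

Under these rounded figures the three sites account for about half of the size difference, which is consistent with "a large fraction".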

High overall gene order conservation can also be observed 
with respect to the mouse and to the rat. The current NCBI 
map of murine chromosome 7 (build 30) shows an identical 
albeit inverted order of the genes Gpi - Bckdha with respect to 
pig or human. In contrast, in the mouse the segment between
the genes Arhgef1 - Lipe is parallel to the human and pig. This
indicates an internal rearrangement within a block of conserved
synteny. The same phenomenon can be observed on the
rat chromosome 1, which shows complete conservation to 
MMU 7 in this region. 

Recent RH data in cattle demonstrate that parts of BTA18
are homologous to the region investigated in this study
(Goldammer et al., 2002). However, the proposed gene order in cattle
(UBA52 - GPI - COX6B - CAPNS1 - POLR2I - GMFG -
BLVRB - MIA - PAFAH1B3 - LIPE - BCKDHA - AXL) is
not in good agreement with the other mammalian species,
indicating several chromosomal rearrangements in cattle and/or
inaccuracies of the bovine RH map.

Conclusions 

Our report provides a detailed high-resolution map and
clone-contig of a 5.5 Mb region on SSC6q1.2, which represents
one of the most gene-rich regions of the pig genome. The mapping
information will facilitate the accurate assembly of whole
genome shotgun DNA sequences of this region during the 
upcoming pig genome project. Furthermore, the integration of 
physical mapping information with the genetic map allows the 
rational selection of positional candidate genes for QTLs that 
have been mapped to this genome region. 
Supplementary Figure

Physical map of the isolated BAC/PAC contig on SSC6q1.2. STS markers
are indicated vertically on the top and by dotted vertical lines, cDNA
hybridization probes are represented as horizontal lines at the top, and markers
that are associated with genes are denoted in bold. The physical sizes of the
hybridization probes depend on the intron sizes of their respective genomic
targets. BACs and PACs are indicated as horizontal lines below the markers
with their corresponding clone names and insert sizes. For the RPCI-44 BAC
clones the full names are given, while for the TAIGP714 PAC clones abbreviated
names are given. For example, the official clone designation
(http://www.rzpd.de/) TAIGP714G05113 was abbreviated to 714G05113. A minimal
tiling path of 45 clones is indicated in bold. The complete sequence of
the clone 714M17141 has been submitted to the public database under accession
AJ410870.


Supplementary Table 

Primer sequences of all STS markers used
(www.karger.com/doi/10.1159/000075735).
Abstract. ZOO-FISH mapping shows human chromosomes
1, 9 and 10 share regions of homology with pig chromosome 10 
(SSC10). A more refined comparative map of SSC10 has been 
developed to help identify positional candidate genes for QTL 
on SSC10 from human genome sequence. Genes from relevant 
chromosomal regions of the public human genome sequence 
were used to BLAST porcine EST databases. Primers were 
designed from the matching porcine ESTs to assign them to 
porcine chromosomes using the INRA somatic cell hybrid 
panel (INRA-SCHP) and the INRA-University of Minnesota 
Radiation Hybrid Panel (IMpRH). Twenty-eight genes from
HSA1, 9 and 10 were physically mapped: fifteen to SSC10
(ACO1, ATP5C1, BMI1, CYB5R1, DCTN3, DNAJA1,
EPHX1, GALT, GDI2, HSPC177, OPRS1, NUDT2, PHYH,
RGS2, VIM), eleven to SSC1 (ADFP, ALDH1B1, CLTA,
CMG1, HARC, PLAA, STOML2, RRP40, TESK1, VCP and
VLDLR) and two to SSC4 (ALDH9A1 and TNRC4). Two anonymous
markers were also physically mapped to SSC10
(SWR1849 and S0070) to better connect the physical and linkage
maps. These assignments have further refined the comparative
map between SSC1, 4 and 10 and HSA1, 9 and 10.

Copyright©2003 S. Karger AG, Basel 


Comparative maps can help identify candidates for genes 
underlying QTL in species with sparse genome resources. Such 
maps enable pig mappers to take advantage of the vast amount 
of human sequence and annotation generated by the human 
genome project. Documentation of evolutionary breaks between
the pig and human genomes since divergence from a
common ancestor permits recognition of conserved blocks of 
genes. This helps to predict the location and gene order of 
uncharacterised genes in the pig (Shi et al., 2001). 

Comparative maps determined by ZOO-FISH (Fronicke et
al., 1996; Goureau et al., 1996; Rettenberger et al., 1995) provide
a low-resolution view of chromosomal relationships between
pig and human. ZOO-FISH has revealed that most of the
long arm, SSC10q12->q27, corresponds to HSA10p. However,
SSC10p, corresponding to an interstitial region of a human
chromosome, HSA1q41->q42, and SSC10cen->q12, corresponding
to an interstitial region HSA9p13, are not precisely
defined.

Uncharacterised porcine ESTs that are homologous to genes 
from relevant chromosomal regions of the human genome 
sequence can be physically mapped using a somatic cell hybrid 
panel (Yerle et al., 1996) or more accurately with a high-resolution
radiation panel (Yerle et al., 1998) to greatly improve the
resolution of the comparative maps. The limiting factor to this 
process currently is availability of porcine ESTs. Ultimately the 
objective is to identify evolutionary breakpoints at the level of 
adjacent gene loci in the genome sequence. 

Materials and methods 

Primer design 

Primers were designed for genes located on human chromosomes 1, 9 
and 10, where ZOO-FISH suggested the orthologues would be found on pig 
chromosome 10 (SSC10). 

Initially, human mRNA sequences from genes individually identified in
the area of interest were used in a nucleotide Basic Local Alignment Search
Tool (BLASTn Version 2, 2.3) (http://www.ncbi.nlm.nih.gov/BLAST/)
(Altschul et al., 1990) query of the GenBank (http://www.ncbi.nlm.nih.gov) EST
database to identify matching ESTs from pigs for primer design. Human
genome sequence was used to predict the position and approximate size of
introns. Since intron size is not strongly conserved between species, an
intron (<1,500 bp) was incorporated into each amplicon to help distinguish pig and
rodent PCR products if both amplified in the physical mapping panels. Later,
a more systematic approach to surveying for evolutionary breakpoints was
adopted using human genome sequence contigs downloaded from GenBank
along with their complete annotation. A BLASTN search using the contigs
against the porcine EST database was performed on a local personal computer,
and the very large volume of results was filtered using a program (written by
Y. Chen) designed to find highly matching porcine ESTs covering more than
one exon and flanking an intron predicted from the human sequence. Relevant
annotation, such as the name of the human orthologue and the size of
the human intron, was output, and primers were then designed to flank the
predicted intron as above. The Primer3 program (Rozen and Skaletsky,
2000), as implemented in ANGIS's Biomanager (www.angis.org.au), was used for
primer design.
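
The filtering step can be pictured with a minimal sketch. The program written by Y. Chen is not described in detail, so the following is only an illustration under stated assumptions: the data layout, the Junction class, the usable_junctions function and the 20-bp primer flank are hypothetical, and only the <1,500 bp intron limit comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Junction:
    """An exon-exon junction in a human mRNA (hypothetical layout)."""
    mrna_pos: int   # mRNA coordinate at which two exons meet
    intron_bp: int  # size of the human intron at this junction


def usable_junctions(hit_start, hit_end, junctions,
                     max_intron_bp=1500, min_flank=20):
    """Return junctions that an EST hit spans with room for primers.

    hit_start/hit_end are the mRNA coordinates matched by the porcine EST.
    min_flank (assumed value) is the sequence required on each side of the
    junction so that a primer can be placed in each exon.
    """
    return [
        j for j in junctions
        if hit_start + min_flank <= j.mrna_pos <= hit_end - min_flank
        and j.intron_bp < max_intron_bp
    ]


# Invented example: three junctions, one EST hit covering positions 100-500.
example_junctions = [Junction(150, 900), Junction(400, 5000), Junction(700, 1200)]
selected = usable_junctions(100, 500, example_junctions)
print(selected)
```

In this invented example only the first junction qualifies: the second has an intron that is too large, and the third lies outside the EST match.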

Finally, one consensus primer pair was designed based on conserved sequence
identified by alignment (ClustalW [fast]; Thompson et al., 1994) in
ANGIS's Biomanager of orthologous mRNA from mouse and human. Primers
were designed as above to flank an intron predicted from the human gene
sequence.

Evaluation of primers 

Primers were evaluated on pig DNA and control mouse and hamster
genomic DNA templates provided with the INRA-SCHP (Yerle et al., 1996).
The PCR reactions were carried out in a 25-µl volume containing 40 ng of
genomic DNA (mouse, hamster or pig), 1× Taq DNA polymerase buffer,
0.8 µM of each primer, 1.5 mM MgCl2 and 100 µM dNTPs, covered with two
drops of paraffin oil. After denaturation at 95 °C for ten min and holding at
80 °C, 1 U of Taq polymerase was added. For PCR products expected to be
less than 500 bp, touchdown program 1 was used (44 cycles 95 °C 40 s,
63-55 °C 60 s, 72 °C 60 s, 1 cycle 72 °C 20 min), and for those greater than
500 bp, program 2 (44 cycles 95 °C 40 s, 63-55 °C 60 s, 72 °C 1 min 30 s, 1
cycle 72 °C 20 min). When mouse or hamster products of identical size
co-amplified with porcine products, restriction enzyme digestion was used,
where possible, to discriminate between pig and rodent PCR products.

Somatic cell hybrid panel and regional assignments 

10 ng (2 µl) of DNA from the 27 pig/rodent (Chinese hamster 1-19 and
mouse 20-27) hybrid clones of the INRA-SCHP (Yerle et al., 1996) was used 
as a template for PCR. Primers for two anonymous pig markers (SWR1849 
and S0070) were also used on the INRA-SCHP. The PCR protocol was as 
above. Relevant restriction digestion was used to discriminate identical sized 
porcine and rodent PCR products when they co-amplified from the panel. 
Assignments were made using software available on the WWW INRA server 
(http://www.toulouse.inra.fr/lgc/pig/hybrid.htm) (Chevalet et al., 1997). 

High-resolution radiation hybrid mapping 

Genes were then mapped on the high-resolution INRA-University of 
Minnesota Radiation Hybrid Panel (IMpRH) (Yerle et al., 1998). PCR was 
performed on the 90-clone IMpRH panel using 20 ng of DNA from each 
hybrid clone and on control DNA from Chinese hamster and pig using the 
previously described PCR protocols. Results were analysed with the IMpRH
mapping tool (Milan et al., 2000) via the IMpRH web server
(http://www.toulouse.inra.fr/lgc/pig/RH/Menuchr.htm) to map the genes in relation to
previously localised markers on the first generation IMpRH map (Hawken et al.,
1999). Gene order was assessed using RHMAP 3.0 (Boehnke et al., 1996).
Linkage groups were detected using RH2PT and the most likely order of
genes was then determined using RHMAXLIK following similar procedures
to Rink et al. (2002). Maps were drawn using Genetic MapCreator
(http://www.wesbarris.com/mapcreator/).
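
Table 2 reports a retention frequency for each locus typed on the IMpRH. As a minimal sketch, assuming a typing vector coded as one character per hybrid clone ('1' = marker retained, '0' = absent, '?' = ambiguous), the retention frequency is the fraction of scored clones that retain the marker. The coding and the function name are assumptions made for illustration, not the format of the IMpRH server.

```python
def retention_frequency(vector: str) -> float:
    """Fraction of unambiguously scored clones that retain the marker."""
    scored = [c for c in vector if c in "01"]  # ignore ambiguous typings
    if not scored:
        raise ValueError("no unambiguous typings in vector")
    return scored.count("1") / len(scored)


# Invented 10-clone vector (the real panel has 90 clones).
example_vector = "1100?10010"
example_rf = retention_frequency(example_vector)
print(round(example_rf, 2))
```

Here four of the nine scored clones are positive, so the retention frequency is 4/9.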

Results 

Porcine primers and DNA amplification 
Twenty-eight of thirty-six primer pairs amplified a porcine product
in PCR. Only interim gene names and symbols were available
for CMG1, CYB5R1, HSPC177, HARC and RRP40. All
other ESTs have been named according to the official Homo
sapiens gene symbol and name nomenclature
(http://www.gene.ucl.ac.uk/nomenclature/).


Table 1. Primer sequences and expected PCR product sizes

Locus | Forward primer (5'-3') | Reverse primer (5'-3') | Estimated size of PCR product (bp)
ALDH9A1 | ACATCTGGAGGAATCCTTG | CCCTCCTTTAATTTGGGATCTT | 400
TNRC4 | CGGCTCACAGTGCCTACC | CGACCCCGTTGGGATACAGA | 400 a
GALT | GGAAGGAACGTCTGGTCCTAAC | TCAGCTCAGGTAGCCGCC | 400
CYB5R1 | AAGTTCCGCTTTGCCCTG | CGCTGGTGATGGGAGTG | 700
RGS2 | GAAAAAGAAGCCCCAACAGA | CCAAGAAACGGGGATAAGA | 500
EPHX1 | CTTCAAGGCGGAGACTGG | CAGGAAAAAGGTCAGGGTGTAG | 700 b
ACO1 | CGGGGGCATCCTCAACTAC | TGCTGATTTTGTGGGCAGAG | 250
DCTN3 | ACATCAAAGCCGTTCCTGAG | CAGCCTCTAGCTGGCAAAGT | 350
DNAJA1 | GGACCATACAGCTGGTTGAGG | TGCCTTCATTTAGCACACAC | 500
NUDT2 | CTGGCTGGCGGAAGTAAA | TGAAGGTGGGCTCGTCTC | 250
OPRS1 | ACCATCATCTCTGGCACCTT | CTCCACCATCCATGTGTTTG | 1200
S0070 | GGCGAGCATTTCATTCACAG | GAGCAAACAGCATCGTGAGC | 300
SWR1849 | CCTGTTCTGCCTCTAGCCTG | CTGAGAAGCCTGTGCATCAG | 160
HSPC177 | GGCCATCACTTTAACCAACC | ACTCATTGCTTGGAGAAGGC | 400
ATP5C1 | GCGGCTCTGGAGTAGTGAAA | TTCTGACAAGGACAGCAGAAAG | 600
BMI1 | TGCCCTCTCTGTAATGTCTGG | TGCAGACTTCGGACAATGAA | 300
GDI2 | GACCAAGGAACCCGAGAAG | TCCCCGTAGATGTCGTTTTT | 400
PHYH | GCCTCGTGAAAGGAGAAAGA | CCCACTGCAAAATTACCCATAC | 250
VIM | AATGAAACTTCCCAGCATCAC | CACAAGGCTTCTTCGGTAAAC | 300
ADFP | GCTCCATTCCACTGTCAACC | TGCTTCTCTTCCATTCCACC | 1700
ALDH1B1 | CAGAAGGAAGGGGCAAAAC | TAAAGCCTCCGAAAGGTGTG | 400
CLTA | CCGGCTGTGTGACTTTAACC | CTCTGAATGCCAGGGAGAAC | 500 b
RRP40 | AACGGAATGGGTGTGATTG | GTTTGCTGAATGGTTTTTGCT | 1300
HARC | TCGATGGGATGATAGTCAGAGA | CTGCATTACGACAGCTTGATG | 900
CMG1 | TGAAATGGAAGAAGTAATGAATGA | TCCATACTTTGCGTTTCTCG | 1500
PLAA | TGGCTCATCTGGTGCTAATC | TGTTTCCAAGTCCCAACATC | 1200 c
STOML2 | GGGACTGGAGCCCTTTTG | CTGCTGGGGCACAAATAGTA | 400 d
TESK1 | GCTACCGCAACCTGAACTG | AGGGGTTGGAAAGGAAATGT | 300
VCP | CAGAGCAACCCATCAGCAC | CAGCCAGGAGGTCCATAGAA | 400
VLDLR | TGTGAAGATGGCTCTGATGAA | TCCAGGACACTGGGATACAC | 1600

a NciI digestion required to distinguish porcine and rodent PCR products.
b CfoI digestion required to distinguish porcine and rodent PCR products.
c HaeIII digestion required to distinguish porcine and rodent PCR products.
d DdeI digestion required to distinguish porcine and rodent PCR products.
Table 2. Localisation of twenty-eight genes and two anonymous markers on the INRA-SCHP and/or IMpRH

Locus | Human location a | Porcine EST (human mRNA) GenBank acc. no. | SCHP b: porcine location | SCHP b: regional probability | SCHP b: correlation coefficient | IMpRH c: 2pt nearest marker (chromosome) | IMpRH c: distance (cR) | IMpRH c: 2pt LOD score | IMpRH c: retention frequency
ALDH9A1 (aldehyde dehydrogenase 9 family, member A1) | 1q23.1 | BI336843 (NT_004668) | - | - | - | SW1364 (4) | 41 | 7.88 | 0.41
TNRC4 (trinucleotide repeat containing 4) | 1q21 | B4441154 (NT_021907) | - | - | - | SW512 (4) | 18 | 13.21 | 0.23
GALT (galactose-1-phosphate uridylyltransferase) | 9p13 | Consensus primer | - | - | - | SSC25A02 (10) | 40 | 9.75 | 0.4
CYB5R1 (cytochrome b5 reductase 1 (B5R.1)) | 1p36.13-q41 | BE236092 (NT_034409) | - | - | - | SWC19 (10) | 49 | 6.42 | 0.24
RGS2 (regulator of G-protein signalling 2, 24 kDa) | 1q31 | BF711291 (NT_004671) | - | - | - | SW830 (10) | 58 | 5.36 | 0.31
EPHX1 (epoxide hydrolase 1, microsomal [xenobiotic]) | 1q42.1 | AB000883 (NT_004525) | 10p16-p11 | 0.9982 | 0.8575 | SW1894 (10) | 34 | 6.11 | 0.34
ACO1 (aconitase 1, soluble) | 9p22-p13 | Z84039 (NM_002197) | 10q11-q12 | 0.7422 | 0.7826 | SSC25A02 (10) | 31 | 10.54 | 0.31
DCTN3 (dynactin 3) | 9p13 | BF711908 (NM_007234) | 10q11-q12 | 0.7431 | 0.846 | SSC25A02 (10) | 28 | 8.27 | 0.28
DNAJA1 (DnaJ [HSP40] homolog, subfamily A, member 1) | 9p13-p12 | BI184750 (NM_001539) | 10q11-q12 | 0.8977 | 0.9177 | SSC25A02 (10) | 25 | 9.12 | 0.25
NUDT2 (nudix [nucleoside diphosphate linked moiety X]-type motif 2) | 9p13 | AW482597 (NM_001161) | 10q11-q12 | 0.7423 | 0.7255 | - | - | - | -
OPRS1 (opioid receptor, sigma 1) | 9p11.2 | BI3999929 (NM_005866) | 10q11-q12 | 0.7431 | 0.846 | SSC25A02 (10) | 25 | 9.12 | 0.25
S0070 (anonymous marker) | - | - | 10q11-q12 | 0.7431 | 0.846 | SSC10G07 (10) | 41 | 5.13 | 0.41
SWR1849 (anonymous marker) | - | - | 10q11-q12 | 0.8091 | 0.8224 | SSC10G07 (10) | 44 | 7.46 | 0.34
HSPC177 (hypothetical protein) | 9p12 | BF712766 (NM_016410) | 10q11-q12 | 0.9757 | 0.85 | SSC25A02 (10) | 36 | 11.91 | 0.36
ATP5C1 (ATP synthase, H+ transporting, mitochondrial F1 complex, gamma polypeptide 1) | 10q22-q23 | BI399612 (NM_005174) | 10q17 | 0.89 | 1 | - | - | - | -
BMI1 (B lymphoma Mo-MLV insertion region [mouse]) | 10p13 | BI404608 (NM_005180) | 10q17 | 0.8794 | 0.9282 | SW1829 (10) | 43 | 11.16 | 0.43
GDI2 (GDP dissociation inhibitor 2) | 10p15 | Z84059 (NM_001494) | 10q17 | 0.8794 | 0.9282 | SWR67 (10) | 37 | 4.05 | 0.37
PHYH (phytanoyl-CoA hydroxylase [Refsum disease]) | 10pter-p11.2 | BF711970 (NM_006214) | 10q17 | 0.8794 | 0.9282 | SW2043 (10) | 46 | 18.15 | 0.46
VIM (vimentin) | 10p13 | BF701886 (NM_003380) | 10q17 | 0.89 | 1 | SW1991 (10) | 42 | 6.77 | 0.42
ADFP (adipose differentiation related protein) | 9p21 | AU055652 (NM_001122) | 1q23-q27 | 0.7754 | 0.8058 | - | - | - | -
ALDH1B1 (aldehyde dehydrogenase 1 family, member B1) | 9p13 | BI118972 (NM_000692) | 1q23-q27 | 0.8575 | 1 | SW2551 (1) | 30 | 9.68 | 0.3
CLTA (clathrin, light polypeptide [Lca]) | 9p13 | BI182475 (NM_001833) | 1q23-q27 | 0.8573 | 0.8748 | SW2551 (1) | 27 | 14.4 | 0.27
RRP40 (exosome component Rrp40) | 9p11 | BI344574 (NM_016042) | 1q23-q27 | 0.8573 | 0.8748 | SW2551 (1) | 18 | 6.64 | 0.18
HARC (Hsp90 associating relative of CDC37) | 9p24.1 | BG383220 (NM_017913) | 1q23-q27 | 0.8575 | 1 | SW1462 (1) | 25 | 4.23 | 0.25
CMG1 (capillary morphogenesis protein 1) | 9p13.3 | BF444460 (NT_023974) | 1q23-q27 | 0.8754 | 0.8919 | SW216 (1) | 25 | 10.35 | 0.25
PLAA (phospholipase A2-activating protein) | 9p21 | AW619064 (NM_004253) | 1q23-q27 | 0.839 | 0.7416 | SW216 (1) | 18 | 7.71 | 0.18
STOML2 (stomatin (EPB72)-like 2) | 9p13.1 | BF192094 (NM_013442) | 1q23-q27 | 0.8575 | 1 | SW2551 (1) | 23 | 11.47 | 0.23
TESK1 (testis-specific kinase 1) | 9p13 | BG835134 (NM_006285) | 1q23-q27 | 0.8572 | 0.8748 | SW803 (1) | 24 | 11.75 | 0.24
VCP (valosin-containing protein) | 9p13-p12 | BI181457 (NM_007126) | 1q23-q27 | 0.8573 | 0.8748 | SW2551 (1) | 27 | 7.25 | 0.27
VLDLR (very low density lipoprotein receptor) | 9p24 | BF702786 (NM_003383) | 1q23-q27 | 0.857 | 0.7545 | - | - | - | -

a The cytogenetic location was obtained from LocusLink at NCBI (http://www.ncbi.nlm.nih.gov/LocusLink/).
b SCHP: The localisation on the SCHP gives the regional assignment with statistical scores.
c IMpRH analysis: The closest marker obtained by two-point analysis is given with the bearing chromosome in parentheses, together with the distance (cR), retention frequency and LOD score values.
Fig. 1. Localisation of genes onto the SSC10 IMpRH map. Genes mapped are in bold and are compared with their corresponding
human chromosome and position (http://www.ncbi.nlm.nih.gov/mapview/maps.cgi?&query). The IMpRH map is compared with
the SSC10 cytogenetic and linkage map (http://abcenter.coafes.umn.edu/RHmaps/chromosome/chromosome10.html).


Twenty-two had an unambiguously recognisable porcine-specific
product (ACO1, ADFP, ALDH9A1, ALDH1B1,
ATP5C1, BMI1, DCTN3, DNAJA1, GDI2, RRP40,
HSPC177, CYB5R1, HARC, CMG1, NUDT2, OPRS1,
PHYH, RGS2, TESK1, VCP, VIM, and VLDLR). Six had
identical sized products in the hamster, mouse and pig (CLTA,
GALT, EPHX1, PLAA, STOML2, and TNRC4). Table 1 documents
the primers, predicted product sizes and the restriction
enzyme digestion, where needed, for discriminating porcine
and rodent PCR products. Only the GALT primers were designed
from mouse/human consensus sequence; all others were
designed from porcine ESTs.

SCHP mapping 

Twenty-three genes as well as two anonymous markers
(S0070 and SWR1849) were regionally assigned as shown in
Table 2. The most probable chromosomal assignment of each
locus is listed together with the calculated statistical scores.

IMpRH mapping 

Twenty-four genes and two anonymous markers were more
precisely mapped using the IMpRH as shown in Table 2. Those
genes mapped to SSC10 are shown in Fig. 1 and those to SSC1
in Fig. 2. CYB5R1, GALT, TNRC4, RGS2 and ALDH9A1
were mapped only using the IMpRH. ADFP, ATP5C1 and
VLDLR could not be mapped on this resource. The most likely
gene order determined using RHMAP 3.0 (Boehnke et al.,
1996) for the six loci linked to SSC25A02 is
GALT-HSPC177-ACO1-DNAJA1-DCTN3-OPRS1, which is 100
times more likely than the next best order, in which HSPC177 and
ACO1 are reversed. The gene order determined for the
genes linked to SW2551 is VCP-STOML2-CLTA.
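
Likelihood comparisons between gene orders are often expressed on a log10 scale. As a small illustration (the function name is hypothetical and this is not RHMAP output), a likelihood ratio of 100 between the best and the next best order corresponds to a difference of 2 log10 units:

```python
import math


def log10_likelihood_difference(likelihood_ratio: float) -> float:
    """Convert a likelihood ratio between two orders to log10 units."""
    if likelihood_ratio <= 0:
        raise ValueError("likelihood ratio must be positive")
    return math.log10(likelihood_ratio)


best_vs_next = log10_likelihood_difference(100)
print(best_vs_next)
```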
Fig. 2. Localisation of genes onto the SSC1 IMpRH map. Genes mapped are in bold and are compared
with human chromosome 9 and their position in Mb (http://www.ncbi.nlm.nih.gov/mapview/maps.cgi?&query).
The IMpRH map is compared with the SSC1 cytogenetic and linkage map
(http://abcenter.coafes.umn.edu/RHmaps/chromosome/chromosome1.html).


Discussion 

Twenty-eight genes and two anonymous markers were physically
mapped using the INRA-SCHP and/or the
IMpRH. The INRA-SCHP was used to assign markers to chromosomes,
after which the IMpRH was used for high-resolution
analysis of chromosomal position.

Three genes from HSA1 were localised to the p arm of
SSC10. EPHX1 was shown to be linked to SW1894 by two-point
analysis using the IMpRH, after its localisation onto
SSC10p16->p11 with the INRA-SCHP. CYB5R1 and RGS2
were directly mapped on the IMpRH. Their approximate location
on the IMpRH map can be seen in Fig. 1. RGS2 and
CYB5R1 are from a syntenic block on HSA1 that has been
found to map to SSC10, while EPHX1 is located on a different
conserved syntenic block further down HSA1. However, in the
pig, EPHX1 is located between RGS2 and CYB5R1, implying
internal rearrangements with respect to HSA1 and SSC10.

ACO1, DCTN3, DNAJA1, HSPC177 and OPRS1 were all
found to be linked to the marker SSC25A02 using the IMpRH,
confirming their localisation to SSC10q11->q12 by the INRA-SCHP.
The ACO1 (IREB1) position on SSC10q11->q12 confirmed
the published position (Wintero et al., 1998). NUDT2
(AP4A) was assigned to the region SSC10q11->q12 by the
INRA-SCHP, confirming the study by Jorgensen et al. (1997).
The previously mapped porcine EST SSC25A02 (Hawken et
al., 1999) is homologous to NUDT2, so there was no need to
further map this gene on the IMpRH. GALT was linked to
SSC10G07 on SSC10 using the IMpRH, contradicting a published
position (Pinton et al., 2000) on SSC14 based on heterologous
FISH using a goat bacterial artificial chromosome (BAC)
on porcine chromosomes but agreeing with a recent report
(Wimmers et al., 2002) mapping it adjacent to SSC10G07.
The gene order GALT-HSPC177-ACO1-DNAJA1-DCTN3-OPRS1
is rearranged with respect to the human.

VLDLR, HARC, ADFP, PLAA, CMG1, VCP, STOML2,
TESK1, CLTA, RRP40 and ALDH1B1 from HSA9p mapped to
SSC1q23->q27 using the INRA-SCHP. On HSA9, however, these genes
enclose the block of loci described in the previous paragraph (~2 Mb),
which mapped to SSC10. Mapping of ADFP to SSC1q23->q27
confirms published results (Davoli et al., 2002), as does the
mapping of VLDLR to SSC1q23->q27 (Pinton et al., 2000).
The gene order in humans is HARC-PLAA-CMG1-(block
mapped to SSC10)-VCP-STOML2-TESK1-CLTA-RRP40.
These genes were further mapped using the IMpRH to determine
whether gene order is conserved. The gene order seems to
be conserved between HSA9 and SSC1, with a similar order of
genes shown in Fig. 2. HARC was mapped by two-point analysis
to be closest to SW1462, but multipoint analysis placed it
between SW970 and SWR337. This discrepancy may be due to its low
LOD score of 4.23. The multipoint results for this marker, using the
IMpRH mapping tool, are more consistent with the human gene
order, so this position has been shown in Fig. 2. The mapping of
more genes from the same region on HSA9 would confirm
whether human gene order is conserved in pig.

Five loci from HSA10 were localised to SSC10q, as predicted
from the ZOO-FISH comparative map. VIM, PHYH,
GDI2, BMI1, and ATP5C1 were mapped on SSC10q17. The
assignment of VIM to SSC10q17 confirms the published
assignment (Tosser-Klopp et al., 1998), and the localisation of
ATP5C1 confirms the result of Nonneman and Rohrer (2003),
who mapped it by linkage analysis adjacent to the marker
SWR67. The other three map assignments are novel. To more
precisely determine the location of the genes from HSA10,
BMI1, GDI2, PHYH and VIM were also mapped on the
IMpRH. GDI2 was mapped with a low LOD score of 4.05. The
gene order (GDI2-PHYH-VIM-BMI1) does not seem to be
totally conserved between the pig and human, so some internal
rearrangement may have occurred. IMpRH assignment of
genes to SSC10q revealed a systematic problem with the SCHP
assignments to this chromosome, where all loci mapped to
SSC10q17 on the SCHP. It appears that an artefactual negative
result from hybrid number 15 in the SCHP causes loci to be
assigned to SSC10q17 rather than further up the chromosome.
The IMpRH showed the loci distributed over a broader range
from SSC10q13->q17.

Two genes from HSA1 were directly mapped using the
IMpRH, and were found to be located on SSC4. TNRC4 and
ALDH9A1 were found to be linked to SW512 and SW1364, respectively, by
two-point analysis.


The anonymous markers S0070 and SWR1849 were found
to be located on either side of SSC10G07. The mapping of these
markers will help to further correlate the linkage map and the 
physical map. 

Identifying evolutionary breakpoints: Human evolutionary
breakpoints relative to SSC10 (HSA1, 9 & 10)

The limited sequence data for the pig means that the refinement
of human evolutionary breakpoints relative to SSC10 is
dependent on more precise mapping of genes in pig. Mapping
using the IMpRH allows the distance between genes flanking
the breakpoint to be determined in cR, which can be converted
into base pairs.

The position of the first evolutionary breakpoint with
respect to SSC10 was localised between CYB5R1, from HSA1,
and GALT, from HSA9 (Fig. 1). However, these loci are not yet
statistically “connected” on the radiation hybrid map and there 
is thus no centiRay (cR) or nucleotide estimate of the degree of 
imprecision of the position of the evolutionary breakpoint. 

The second human evolutionary breakpoint on SSC10 lies
between OPRS1 from HSA9 and VIM from HSA10, about
100 cR apart (Fig. 1). Assuming 1 cR equals about 69.8 kbp
(Hawken et al., 1999), the breakpoint is localised to an interval of
approximately seven million base pairs. More loci will have to
be mapped to refine this position.
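
The conversion used above can be written out as a short sketch. The 69.8 kbp per cR figure is the value quoted in the text from Hawken et al. (1999); it is treated here as a constant average, which is an approximation, and the function name is illustrative.

```python
KBP_PER_CR = 69.8  # value quoted in the text for the IMpRH panel


def cr_to_mbp(distance_cr: float, kbp_per_cr: float = KBP_PER_CR) -> float:
    """Convert a radiation hybrid distance in centiRays to megabase pairs."""
    return distance_cr * kbp_per_cr / 1000.0


interval_mbp = cr_to_mbp(100)
print(f"{interval_mbp:.2f} Mbp")  # 6.98 Mbp, i.e. about seven million bp
```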

Mapping porcine breakpoints relative to HSA9 (SSC1, SSC10 and SSC1)

The precise mapping of porcine genes is not required for the
identification of porcine evolutionary breakpoints relative to
the human genome sequence. Evolutionary breaks between loci
in the human sequence can be identified even if the porcine loci
are mapped only at chromosomal precision. Thus porcine evolutionary
breaks relative to human gene order can be determined,
down to adjacent locus precision in theory, even with
the imprecise INRA-SCHP mapping methodology.

An evolutionary breakpoint lies between CMG1 and ACO1
on HSA9 in an interval of about five million base pairs, but this 
could not be refined further as there were no more porcine 
ESTs in the area. The next evolutionary breakpoint between 
GALT and VCP was refined to a region of about 400,000 bp. 
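
The reasoning of this section can be sketched in a few lines: walk along the human gene order and report every pair of adjacent mapped loci that are assigned to different pig chromosomes. The locus list below follows the human order and the pig assignments given in the text, with only ACO1 and GALT standing in for the block mapped to SSC10; the function name is illustrative.

```python
def find_breakpoints(ordered_loci):
    """Return adjacent locus pairs whose pig chromosome assignments differ.

    ordered_loci is a list of (locus, pig_chromosome) tuples in human
    gene order.
    """
    return [
        (locus_a, locus_b)
        for (locus_a, chrom_a), (locus_b, chrom_b)
        in zip(ordered_loci, ordered_loci[1:])
        if chrom_a != chrom_b
    ]


# Human (HSA9) order with pig assignments as described in the text.
hsa9_loci = [
    ("HARC", "SSC1"), ("PLAA", "SSC1"), ("CMG1", "SSC1"),
    ("ACO1", "SSC10"), ("GALT", "SSC10"),
    ("VCP", "SSC1"), ("STOML2", "SSC1"), ("TESK1", "SSC1"),
    ("CLTA", "SSC1"), ("RRP40", "SSC1"),
]

breakpoints = find_breakpoints(hsa9_loci)
print(breakpoints)
```

Only chromosome-level assignments are needed as input, which is why the imprecise SCHP data are sufficient for this step.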

Success rate of mapping 

In addition to the genes described here, unsuccessful attempts
were made to map another eight genes using the same
approach. The mapping failed for a variety of reasons, including
failure to amplify a single distinct PCR product or band
sizes identical in hamster, mouse and pig with no differentiating
enzyme. The success rate, from primers designed to genes
mapped, was 78% (28 of 36).
Abstract. Conserved segments have been identified by
ZOO-FISH between pig chromosome 9 (SSC9) and human
chromosomes 1, 7 and 11. To assist in the identification of
positional candidate genes for QTL on SSC9, the comparative
map was further developed. Primers were designed from porcine
EST sequence homologous to genes in regions of human
chromosomes 1, 7 and 11. Porcine ESTs were then physically
assigned using the INRA somatic cell hybrid panel (INRA-SCHP)
and the high-resolution radiation hybrid panel
(IMpRH). Seventeen genes (PEPP3, RAB7L1, FNBP2, MAPKAPK2,
GNAI1, ABCB1, STEAP, AKAP9, CYP51A1, SGCE,
ROBO4, SIAT4C, GLUL, CACNA1E, PTGS2, C1orf16 and
ETS1) were mapped to SSC9, while GUSB, CPSF4 and THG-1
were assigned to SSC3.

Copyright©2003 S. Karger AG, Basel 


The development of detailed comparative maps assists in 
the identification of candidate genes responsible for mapped 
QTL. This comparative positional candidate gene approach 
makes use of the wealth of information generated by the human 
genome project. Comparative maps between SSC9 and human
chromosomes 1, 7 and 11 have been developed by heterologous
chromosome painting (Rettenberger et al., 1995; Fronicke et
al., 1996; Goureau et al., 1996). However, the painting
resolution is limited, with the borders of conserved segments
often not well defined. Consequently, ZOO-FISH data alone are
insufficient to provide the information required for
identifying comparative positional candidate genes.

To improve the comparative map of porcine chromosome
9, porcine EST sequences homologous to genes from human
chromosomes 1, 7 and 11 were physically mapped using a
somatic cell hybrid panel (Yerle et al., 1996) and a high-resolution
radiation panel (Yerle et al., 1998). The regions of human
chromosomes used to refine the comparative map were chosen
based on chromosome painting.

Supported by Australian Pork Limited (APL) grant 1756 to C. Moran.
Received 28 May 2003; manuscript accepted 30 July 2003.

Request reprints from Chris Moran
Centre for Advanced Technologies in Animal Genetics and Reproduction
Faculty of Veterinary Science, R.M.C Gunn Building (B19)
University of Sydney, NSW 2006 (Australia)
telephone: +61 2 9351 3553; fax: +61 2 9351 2114
e-mail: Chris.Moran@vetsci.usyd.edu.au

Materials and methods 

Gene selection and primer design 

Human cDNA sequences (http://www.ncbi.nlm.nih.gov/LocusLink/) 
were obtained from human chromosomal regions suspected to map to SSC9. 
To identify homologous pig ESTs, a standard nucleotide-nucleotide 
BLASTN (Basic Local Alignment Search Tool - Nucleotide) on the GenBank 
non-mouse and non-human EST database was performed (Altschul et al., 
1997). The position and approximate size of introns within the porcine ESTs 
were inferred from the human homologues. The program Primer3 (Rozen 
and Skaletsky, 2000; http://www.angis.org.au) was used to design PCR primer
pairs from porcine sequence separated by a predicted intron, ranging in
size from 200 to 1,600 bp. 

The primer design process was automated using a computer program 
written by Dr Yizhou Chen. A contig from the human genome sequence in a 
region of interest (e.g. near an evolutionary breakpoint) was downloaded 
along with the GenBank annotations and a BLASTN performed against porcine
ESTs.

PCR amplification 

PCR reactions were carried out in a 25-µl reaction mixture overlaid with
paraffin oil. The mixture contained 20 ng of template DNA, 1.0 mM MgCl2,
100 µM dNTP, 1× Taq DNA Polymerase buffer, 0.8 µM of each oligonucleotide
primer and 0.5 U Taq polymerase. The PCR reaction mixture for MAPKAPK2
and FNBP2 contained 1.5 mM MgCl2.

Amplifications were performed using hot start and touchdown PCR conditions
on a PTC-100 (MJ Research Inc.) thermocycler. The samples were
denatured at 95 °C for 10 min, held at 80 °C during addition of Taq polymerase,
then touchdown PCR was run. Two PCR programs were used, program
A (44 cycles 95 °C 40 s, 63-55 °C 60 s, 72 °C 60 s, 1 cycle 72 °C 20 min)
and program B (44 cycles 95 °C 40 s, 63-55 °C 60 s, 72 °C 1 min 30 s, 1 cycle
72 °C 20 min). Program B was used for the amplification of larger PCR products
(above 800 bp).

Table 1. Primer sequences and expected PCR product sizes

Locus | Forward primer (5'-3') | Reverse primer (5'-3') | Estimated size of PCR product (bp)
ABCB1 | GCCTCGTATCTTGCTTCTGG | AAGTCTGCGTTCTGGATGGT | 800
AKAP9 | TGCTGAAGAACCCAATTTCTG | AGAAGCAAGAGCTAGAACGAGAAG | 400
CACNA1E | AGGCTGTCTTCGACTGTGTG | GCCTTTCACCTCCATCTTGT | 700
C1orf16 | CCAAAGGAGACCATCTGACC | CCACTTGGTCTTCACCTCATC | 500
CPSF4 | CTTCCTGCACATCGACCC | AGATGACTCTCCGCGTGTG | 500
CYP51A1 | CAATCCAGAAACGCAGACAA | CGCCAAGAGCAGTCCAATA | 600
ETS1 | TCCCGCTATACCTCGGATTA | GGATGGAGGGTCTGGTAGG | 1000
FNBP2 | GCATCCCTGCTGCTTTACC | CTGGACCTCTCCACGACAC | 1600 a
GLUL | CACCCCTGGTTTGGAATG | GCCATAGGCTTTGTCTGCTC | 400 c
GNAI1 | CGGCCATCATCTTCTGTGT | GCTGTCGAACAACTTCATGC | 1700
GUSB | CCTCATCAACGGGAAACCT | CGTAGGGGTAGTGGCTGGT | 400
MAPKAPK2 | CCCCCTTCTATTCCAACCA | TGGGTGGGCTCTGTCTTC | 500
PEPP3 | GACCCTGCTGCCTATGTAATG | GGGGATGGTGGGTAGTTGT | 600 a
PTGS2 | CGGAATGGGACGATGAAC | GCACTCTGGGTCAAACTTC | 500
RAB7L1 | GGCTTCACAGGTTGGACAG | CCTTGGGTGGACAAAGACA | 400
ROBO4 | CCCCCGACATTCACTACCT | CCCACACTCCAGAATCACCT | 500
SGCE | GGAAAAGAGAAACATGCAAACAC | TGTCACGGAGTTCTTTGGTAGA | 800
SIAT4C | GGCTTTCGTCCTGGTGGT | GGCATGGCTCCTTCTTCTCT | 500 b
STEAP | GCAGTTTGGGCTTCTCAGTT | GCTCAATCCAGGCATCTTCT | 700
THG-1 | AAGCCTCGTTGGCATTGAC | CTCCTTCAGCACCTCCACCT | 500

a HaeIII digestion required to distinguish porcine and rodent PCR products.
b NciI digestion required to distinguish porcine and rodent PCR products.
c PvuII digestion required to distinguish porcine and rodent PCR products.

Each primer pair was evaluated initially using genomic DNA from pig,
mouse and hamster. Primers that produced a distinct PCR product of
approximately the expected size from porcine genomic DNA were selected
for mapping. In cases when the porcine amplification product was similar in
size to a rodent fragment, NciI, KpnI, HaeIII, PstI, XbaI, AluI, PvuII and
HinfI (Promega) digestions were carried out, in an attempt to discriminate
between porcine and rodent fragments. Primer sequences and expected PCR
product sizes are listed in Table 1.

PCR products (15 µl) and restriction digests (25 µl) were electrophoresed
in 2 % agarose gels and visualised by ethidium bromide staining under UV.
Gels were photographed with a Kodak DC-5000 digital camera.

Regional assignments with a somatic cell hybrid panel 

Regional assignments were obtained using the INRA pig-rodent (hamster
or mouse) somatic cell hybrid panel comprising 27 hybrid clones (Yerle et al.,
1996). After scoring for the presence or absence of a porcine-specific
fragment (of the same size as the porcine control) in each hybrid, the results
were analysed using software available on the WWW INRA server
(http://www.toulouse.inra.fr/lgc/pig/pcr/pcr.htm) that implements the
statistical rules defined by Chevalet et al. (1997).

Radiation hybrid mapping

The 90 clone IMpRH panel (Yerle et al., 1998) was used for high-resolution
gene mapping along with porcine and hamster control DNAs. Porcine
PCR products were scored either as present, absent or ambiguous for each
hybrid. Ambiguous data points were re-amplified and any remaining
discrepancies scored as ambiguous. These results were analysed, using the
IMpRH mapping tool (http://imprh.toulouse.inra.fr/) developed by Milan et
al. (2000), relative to markers previously localised on the IMpRH panel.
Gene order was assessed using RHMAP 3.0 (Boehnke et al., 1996). Linkage
groups were detected using RH2PT and the most likely order of genes was
then determined using RHMAXLIK following similar procedures to Rink et
al. (2002). Maps were drawn using Genetic MapCreator
(http://www.wesbarris.com/mapcreator/).
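
The scoring scheme described above (present, absent or ambiguous for each
hybrid) can be illustrated with a minimal Python sketch. It is not the IMpRH
mapping tool or RHMAP; the function names and the example score vectors are
hypothetical, and it assumes that ambiguous scores are simply left out of
the calculation, which the text does not state.

```python
from typing import Optional, Sequence

# Score per hybrid clone: 1 = porcine product present, 0 = absent,
# None = ambiguous (assumed here to be excluded from all counts).
Score = Optional[int]


def retention_frequency(scores: Sequence[Score]) -> float:
    """Fraction of unambiguously scored hybrids in which the marker is present."""
    informative = [s for s in scores if s is not None]
    if not informative:
        raise ValueError("no unambiguous scores for this marker")
    return sum(informative) / len(informative)


def discordant_hybrids(a: Sequence[Score], b: Sequence[Score]) -> int:
    """Number of hybrids in which exactly one of the two markers is present."""
    return sum(
        1
        for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


if __name__ == "__main__":
    # Hypothetical score vectors for a gene and a reference marker.
    gene = [1, 0, 0, 1, None, 0, 1, 0]
    marker = [1, 0, 0, 1, 1, 0, 0, 0]
    print(retention_frequency(gene))
    print(discordant_hybrids(gene, marker))
```

Two markers that lie close together are rarely separated by a radiation
break, so few hybrids are discordant between them; two-point distance and
LOD estimates are derived from counts of this kind.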

Results 

Twenty primer pairs amplified a porcine product in PCR.
Table 2 shows the genes that were physically mapped using the
INRA-SCHP and IMpRH panel and the location of the human
homologues. Only interim gene names and symbols were available
for PEPP3 and THG-1. All other ESTs have been named
according to the official Homo sapiens gene symbol and name
nomenclature (http://www.gene.ucl.ac.uk/nomenclature/).

Regional assignments with a somatic cell hybrid panel 

Fourteen loci were regionally assigned using the porcine
INRA-SCHP (Table 2). Statistical analysis of five genes indicated
two regions of one chromosome with equal probability,
requiring resolution by IMpRH mapping. All assignments were
associated with an error risk of less than 0.1 %.

Radiation hybrid mapping 

Genes located on SSC9 were accurately mapped using the
IMpRH. Results for fourteen genes can be seen in Table 2 and
are shown in Fig. 1. GLUL, CACNA1E, PTGS2, C1orf16,
PEPP3 and SIAT4C were mapped only on the IMpRH. The
most likely gene order determined using RHMAP 3.0 (Boehnke
et al., 1996) for the four loci linked to SW174 is
CACNA1E-GLUL-C1orf16-PTGS2. For the three loci located between
SWR250 and S0109, the most likely order is
ROBO4-SIAT4C-ETS1.

Discussion 

HSA1 homologies with SSC9 

Porcine ESTs homologous to the genes RAB7L1, FNBP2
and MAPKAPK2, located on HSA1q32, were physically mapped
using the INRA-SCHP to SSC9q12->(1/3q21). These
regional assignments differ from the results obtained by
chromosome painting, which suggest that HSA1q31->qter is
homologous to SSC9q23->qter (Goureau et al., 1996). The most
likely explanation is that these genes are part of a small conserved
segment that has remained undetected due to the low
resolution of ZOO-FISH.

RAB7L1, PEPP3 and MAPKAPK2 were localised using the
IMpRH (Fig. 1). The PEPP3 gene was mapped adjacent to the
marker S0095 on the SSC9 IMpRH map. S0095 has been physically
assigned by in situ hybridisation to SSC9p11->q11 (Ellegren,
1993). Therefore, the INRA-SCHP and IMpRH mapping
results agree that a block of HSA1 is conserved on the
proximal part of SSC9q. The IMpRH panel assigned the genes
on SSC9 in the order RAB7L1-PEPP3-MAPKAPK2, in contrast
to the human gene order of PEPP3-RAB7L1-MAPKAPK2
(http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi).

Table 2. Localisation of twenty genes on the INRA-SCHP and/or IMpRH

                                                 Localisation in pigs on SCHP b          Localisation on IMpRH c
Locus     Human          Porcine EST             Porcine            Regional     Corre-  2pt       Distance  2pt LOD  Retention
          location a     (human mRNA)            location           probability  lation  nearest   (cR)      score    frequency
                         GenBank acc. no.                                        coeff.  marker

ABCB1     7q21.1         AW316438 (NT_007933)    9q11               0.4969       0.9282  -         -         -        -
                                                 9(2/3q21)->q26     0.4969       0.9282
          ATP-binding cassette, sub-family B (MDR/TAP), member 1
AKAP9     7q21-q22       BI182527 (NT_007933)    9q12->(1/3q21)     0.9757       0.9270  SWR915    17        14.49    0.44
          A kinase (PRKA) anchor protein (yotiao) 9
CACNA1E   1q25-q31       AW414865 (NT_004487)    -                  -            -       SW174     50        6.43     0.24
          calcium channel, voltage-dependent, alpha 1E subunit
C1orf16   1q35           BG732841 (NT_004487)    -                  -            -       SW174     39        8.36     0.28
          chromosome 1 open reading frame 16
CPSF4     7q22.1         BG895380 (NT_007933)    3p17->p16          0.4414       0.9270  -         -         -        -
                                                 3p11               0.4414       0.9270
          cleavage and polyadenylation specific factor 4, 30 kDa subunit
CYP51A1   7q21.2-q21.3   AB009988 (NT_007933)    9q12->(1/3q21)     0.9757       0.9270  -         -         -        -
          cytochrome P450, family 51, subfamily A, polypeptide 1
ETS1      11q23.3        AW785077 (NT_033899)    9p11->(1/2p13)     0.8986       0.9282  S0109     22        11.68    0.31
          v-ets erythroblastosis virus E26 oncogene homolog 1 (avian)
FNBP2     1q32.1         BI119288 (NT_021877)    9q12->(1/3q21)     0.9759       0.8619  -         -         -        -
          formin binding protein 2
GLUL      1q31           Z29636 (NT_004487)      -                  -            -       SW174     68        4.28     0.21
          glutamate-ammonia ligase (glutamine synthase)
GNAI1     7q21           U11249 (NT_007933)      9q11               0.4969       1.0000  SW866     12        16.69    0.35
                                                 9(2/3q21)->q26     0.4969       1.0000
          guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 1
GUSB      7q21.11        BG833981 (NT_007758)    3p17->p16          0.4414       0.9270  -         -         -        -
                                                 3p11               0.4414       0.9270
          glucuronidase, beta
MAPKAPK2  1q32           BF077776 (NT_021877)    9q12->(1/3q21)     0.9757       0.9270  SWR1939   15        15.82    0.43
          mitogen-activated protein kinase-activated protein kinase 2
PEPP3     1q32.1         BI342376 (NT_034410)    -                  -            -       S0095     25        11.59    0.5
          phosphoinositol 3-phosphate-binding protein-3
PTGS2     1q25.2-q25.3   AF207824 (NT_004487)    -                  -            -       SW174     22        12.67    0.3
          prostaglandin-endoperoxide synthase 2 (prostaglandin G/H synthase and cyclooxygenase)
RAB7L1    1q32           AW436737 (NT_034410)    9q12->(1/3q21)     0.9759       0.9282  S0119     8         18.85    0.45
          RAB7, member RAS oncogene family-like 1
ROBO4     11q24.2        BE233526 (NT_033899)    9p11->(1/2p13)     0.8179       0.9286  SWR250    16        15.71    0.42
          roundabout homolog 4, magic roundabout (Drosophila)
SGCE      7q21-q22       BQ601987 (NT_007933)    9q12->(1/3q21)     0.9757       0.9270  SWR915    59        5.21     0.34
          sarcoglycan, epsilon
SIAT4C    11q23-q24      BI185489 (NT_033899)    -                  -            -       S0109     18        13.95    0.37
          sialyltransferase 4C (beta-galactosidase alpha-2,3-sialytransferase)
STEAP     7q21           AF319659 (NT_007933)    9q12->(1/3q21)     0.9757       0.9270  SSC8B04   48        6.91     0.32
          six transmembrane epithelial antigen of the prostate
THG-1     7p21-p15       BE014616 (NT_007933)    3p17->p16          0.4414       0.9270  -         -         -        -
                                                 3p11               0.4414       0.9270
          TSC-22-like

a The cytogenetic location was obtained from LocusLink at NCBI (http://www.ncbi.nlm.nih.gov/LocusLink/).
b SCHP: The localisations on the SCHP gives the regional assignment with statistical scores.
c IMpRH analysis: The closest marker obtained by two-point analysis is given also the distance (cR), retention frequencies and LOD score values.

The human genome sequence shows that the gene MYBPH
is located 1.1 Mb (http://www.ensembl.org/) above PEPP3 on
HSA1. The porcine homologue of MYBPH has been assigned
to SSC9q23 by FISH (Sun et al., 2002). The genes GLUL,
CACNA1E, C1orf16 and PTGS2 are located above MYBPH
on HSA1 (http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi)
and were mapped adjacent to the marker SW174 on SSC9
using the IMpRH panel (Fig. 1). GLUL was previously mapped
near this marker by linkage analysis and on the IMpRH in a
study by Stratil et al. (2002). Sun et al. (2002) previously
mapped PTGS2 to SSC9q23 by FISH. These genes therefore
represent the larger segment of HSA1, homologous to SSC9q23->qter,
detected by chromosome painting (Goureau et al., 1996).
Thus the 1.1-Mb interval between MYBPH and PEPP3 is the
site of a breakpoint that separates the two regions of HSA1
conserved on SSC9q12->(1/3q21) and SSC9q23->qter. The gene
order CACNA1E-GLUL-C1orf16-PTGS2 is conserved with
respect to the human.

Fig. 1. Localisation of genes onto the SSC9 IMpRH map. Genes mapped
are in bold and are compared with their comparative human chromosome
and position (http://www.ensembl.org/). The IMpRH map is compared with
the SSC9 cytogenetic and linkage map
(http://abcenter.coafes.umn.edu/RHmaps/chromosome/chromosome9.html).

HSA7 homologies with SSC9 and SSC3

Nine porcine ESTs homologous to genes on HSA7 were
assigned using the INRA-SCHP. The genes GUSB, CPSF4 and
THG-1 were localised to SSC3, while six (GNAI1, ABCB1,
STEAP, AKAP9, CYP51A1 and SGCE) were mapped to
SSC9.

In this study, GUSB was mapped to SSC3, while GNAI1
was mapped to SSC9. Two loci, POR and ZP3A, located
between GUSB and GNAI1 on HSA7
(http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi), have previously
been mapped to SSC3p16->p17 (Lahbib-Mansais et al., 2000) and
SSC3p15->pter (Bruch et al., 1996), respectively. Combining
the results from Lahbib-Mansais et al. (2000), Bruch et al.
(1996) and this study, the breakpoint in conserved synteny
between HSA7 and pig chromosomes 3 and 9 has been refined
to the 3.7-Mb interval (http://www.ensembl.org/) between
GNAI1 and ZP3A on HSA7.

Similarly, at the distal end of the HSA7 block conserved on
SSC9, another evolutionary breakpoint was further refined.
The break was found to occur within a 1.7-Mb
(http://www.ensembl.org/) region between TAC1 (SSC9q12->(1/3q21);
Lahbib-Mansais et al., 1999) and CPSF4 (SSC3, from this study) on
HSA7.

GNAI1 and ABCB1 were mapped to SSC9q11 or
SSC9(2/3q21)->q26 with equal probability by the INRA-SCHP.
STEAP, AKAP9, CYP51A1 and SGCE from the same block on
HSA7 were mapped to 9q12->(1/3q21). To determine the precise
position of GNAI1 and ABCB1 on SSC9 and to distinguish
between the two regions proposed by the INRA-SCHP, GNAI1
was mapped using the IMpRH panel. Results showed that
GNAI1 was significantly linked to SW866. Based on the position
of SW866 on the IMpRH map (Fig. 1), the most likely
position of GNAI1 and ABCB1 is SSC9(2/3q21)->q26.
ABCB1 was not mapped on the IMpRH panel due to technical
difficulties with the genotyping.

Radiation hybrid mapping of STEAP, AKAP9 and SGCE
showed that gene order is conserved between humans and pigs
for these genes (Fig. 1). However, the mapping results for
ABCB1 and GNAI1 indicate that rearrangements in gene order
have occurred within this HSA7 block conserved on SSC9.

HSA11 homologies with SSC9 

ROBO4, SIAT4C and ETS1 are located on HSA11q23->q24
(http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi).
Using the INRA-SCHP, ROBO4 and ETS1 were physically
mapped to SSC9p11->(1/2p13). In this study, ROBO4,
SIAT4C and ETS1 were also mapped using the IMpRH panel.
SIAT4C and ETS1 were significantly linked to S0109, which
has been physically mapped to 9p12 by in situ hybridisation
(Ellegren et al., 1994). Therefore, the INRA-SCHP and IMpRH
mapping results agree that the distal end of HSA11 is conserved
on the proximal part of SSC9p. These localisations also support
the correspondence between HSA11 and SSC9 previously
identified by chromosome painting and gene mapping
(http://www.toulouse.inra.fr/lgc/pig/compare/SSCHTML/SSC9S.HTM).
The gene order ROBO4-SIAT4C-ETS1 is conserved with
respect to the human.

Evolutionary breakpoint on SSC9

MAPKAPK2 (HSA1q32) and STEAP (HSA7q21) were
mapped on the IMpRH panel, adjacent to the markers
SWR1939 and SSC8B04 respectively, approximately 64 cR
apart (Fig. 1). Hawken et al. (1999) estimated a ratio of
~70 kb/cR for the first-generation porcine whole-genome
radiation hybrid map. Therefore, the evolutionary
breakpoint between MAPKAPK2 and STEAP on SSC9 has
been refined to an interval of ~4.5 Mb.
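
The interval size follows directly from the two figures quoted in the
paragraph above. A minimal Python sketch of the conversion, using only those
two values (the function name is illustrative):

```python
CR_DISTANCE = 64   # cR between SWR1939 and SSC8B04, as given in the text
KB_PER_CR = 70     # approximate kb per cR (Hawken et al., 1999)


def cr_to_mb(distance_cr: float, kb_per_cr: float) -> float:
    """Convert a radiation hybrid distance in cR to an approximate size in Mb."""
    return distance_cr * kb_per_cr / 1000.0


if __name__ == "__main__":
    print(f"{cr_to_mb(CR_DISTANCE, KB_PER_CR):.2f} Mb")  # prints "4.48 Mb"
```

Because the kb/cR ratio is a genome-wide average, the result is an
order-of-magnitude estimate of the interval rather than a measured length.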

Abstract. We report here the localisation of BAIAP1
(13q24), HTR1F (13q45), PTPRG (13q23) and UBE1C
(13q24) by fluorescence in situ hybridisation (FISH), and
BAIAP1 (Swr2114; 21 cR; LOD = 11.03), GATA2 (Sw2448; 37
cR; LOD = 8.26), IL5RA (Swr2114; 64 cR; LOD = 3.85),
LMCD1 (Sw2450; 61 cR; LOD = 4.73), MME (CP; 50 cR;
LOD = 7.75), RYK (Swc22; 12 cR; LOD = 18.62) and SGU003
(Sw1876; 6 cR; LOD = 16.99) by radiation hybrid (RH) mapping
to porcine chromosome 13 (SSC13). The mapping of these
10 different loci (all mapped to human chromosome 3; HSA3)
not only confirms the extended conservation of synteny
between HSA3 and SSC13, but also defines more precisely the
regions with conserved linkage. The syntenic region of the
centromeric part of SSC13 was determined by isolating porcine
bacterial artificial chromosome (BAC) clones (842D4 and
1031H1) using primers amplifying porcine microsatellite
markers S0219 and S0076 (mapped to this region). Sequence
comparison of the BAC end sequences with the human genome
sequence showed that the centromeric part of SSC13 is
homologous with HSA3p24.

Copyright © 2003 S. Karger AG, Basel


SSC13 is of particular interest to the pig breeder community
due to the presence of several economically important genes.
Most prominent amongst these is (are) the gene(s) mediating
resistance/susceptibility against K88 (F4) caused neonatal
diarrhea, mapped to the middle part of SSC13 (Edfors-Lilja et al.,
1995; Python et al., 2002). Apart from this, QTLs for several
production traits such as growth rate and body weight (Knott et
al., 1998; Wu et al., 2002) have been mapped to this chromosome.
Positional cloning of the genes involved greatly benefits
from the availability of well developed and integrated linkage,
RH and cytogenetic maps. Good genetic maps also provide
anchor points for the construction and ordering of BAC contigs,
which themselves form the basis for large scale sequencing
efforts.

Comparative mapping between species with marker (gene)
dense maps and species with less dense maps is an efficient way
to enrich the maps of the latter (O'Brien et al., 1993). Zoo-FISH
analysis and chromosome painting have shown an extended
conservation of synteny between HSA3 and SSC13 (Rettenberger
et al., 1995; Fronicke et al., 1996; Goureau et al., 1996;
Chowdhary et al., 1998). SSC13 chromosome bands q47->q49
correspond to HSA21 while the bulk of SSC13 corresponds to
HSA3 (Rettenberger et al., 1995; Fronicke et al., 1996; Goureau
et al., 1996). Although HSA3 is completely homologous
with SSC13, several intrachromosomal rearrangements were
observed in previous comparative mapping studies (Sun et al.,
1999; Van Poucke et al., 1999, 2001). Based on these results,
HSA3 could be divided into four major chromosome segments:
block 1 (HSA3p26->p24) is more or less equivalent to
SSC13q31; block 2 (HSA3p23->p14.2) contains bands
SSC13q21->q22; block 3 (HSA3p11->q21) extends from
SSC13q41 to q46; block 4 (HSA3q21->q29) comprises
SSC13q31->q41. A fifth block can be defined on SSC13
extending from q47 to q49 (telomere). It corresponds to the
complete HSA21 (Tuggle et al., 2001). Two chromosomal
regions of SSC13 remained uncharted after these efforts: the
proximal end (bands q11->q14) and a central part between
blocks 1 and 2 (bands q23->q24). The aim of the present study
was to identify genes mapping to these two regions in order to
define the corresponding parts of HSA3 and to refine the
overall map of SSC13.

Table 1. Summary of new PCR primers and conditions of loci used in this study

                                      PCR fragment description      PCR conditions
Locus    PCR primers                  Size (bp)  Position in gene a  Ta     Remarks b

BAIAP1   AATGCCACCTTGCTGACC           252        exon 20-21          57 °C  2 mM MgCl2
         ATTGAGATTCCTGCTTTGGTTT
GATA2    GACTCGCAGGGCAACC             646        exon 4-5            60 °C
         CAGTGGCGTCTTGGAGAA
HTR1F    TCTCTATGCCTCCTCTATTCTGG      197        exon 1              57 °C
         CCTACTTGCTTGTCTCTTGTGG
IL5RA    GCACCTGGCTTGTTGG             174        exon 7-8            59 °C
         TCTTTGCTGTATTCTTGGCATT
LMCD1    GACAGCCCCGTGGTCTA            113        exon 6              58 °C
         TTCCAGAAGTAGATGAGGTCCA
MME      TAGCAGAGGCGGGGAAC            294        exon 9-10           59 °C  2.5 mM MgCl2
         TTTATCATCAGTGCCAACAAAAA
PTPRG    GACTCTAAGCACAGCGACTACA       318        exon 23-24          57 °C
         TTCCCAAATCATCCTCCA
RYK      TCACAGACAATGCCCTCTCC         102        exon 7              61 °C
         CCAGACTTTCAAGAGCCATCC
SGU003   GTGTCCAAGCTGCGTCCT           110-140    unknown             59 °C  40 cycles
         AGTAATAAGCAAAAGCCACATTCA
UBE1C    GGACTCTATCATCGCCAGAA         412        exon 8-9            60 °C  2 mM MgCl2
         CATCTATCAAAGGGACAATGGA

a Data are according to human exon numbers at LocusLink database (http://www.ncbi.nlm.nih.gov/LocusLink/).
b PCR conditions different from Materials and methods.

Materials and methods 

Primer design, polymerase chain reaction (PCR) and sequencing 

Information on new primers is presented in Table 1. Primers, designed
with the program Primer3 (Rozen and Skaletsky, 1998), were based on
human or porcine sequences from the GenBank database, except for the
SGU003 primers, which were based on a sequence fragment of porcine BAC
clone 505B4, isolated in a previous study using primers for B4GALT4 (Van
Poucke et al., 2001). These primers amplify a microsatellite that was retrieved
by hybridisation of 505B4 plasmid subclones with a radioactively labeled
(CA)n probe. Primers and PCR conditions for amplifying microsatellites
S0219 and S0076 were as originally described by Robic et al. (1994) and
Wintero et al. (1994) respectively.

PCR was performed on 250 ng porcine genomic DNA with 70 ng of each
primer, 1.5 mM MgCl2, 200 µM dNTPs and 0.5 U Platinum Taq DNA
Polymerase (Invitrogen, Merelbeke, Belgium). A 10-min denaturation at 94 °C
was followed by 30 cycles (30 s at 94 °C, 30 s at Ta and 1 min at 72 °C) and a
final extension of 10 min at 72 °C. The obtained products were cloned in
pCRII (Invitrogen, Groningen, The Netherlands) and sequenced using the
ALFexpress™ Autoread™ Sequencing Kit (Pharmacia, Uppsala, Sweden).
BAC end sequences were obtained by sequencing on purified BAC DNA
using the vector forward (F; 5'-CGACGTTGTAAAACGACGGCCAGT-3')
and reverse (R; 5'-CACAGGAAACAGCTATGACCATGATTACG-3')
primers. The obtained sequences were compared with the databases using
the NIX program (http://www.hgmp.mrc.ac.uk/Registered/Webapp/nix/)
and submitted to the GenBank database. The accession numbers are listed in
Table 2.


BAC library screening and FISH mapping 

Primers for the genes BAIAP1, HTR1F, PTPRG, UBE1C and for the
microsatellite markers S0219 and S0076 were first shown not to amplify the
same fragment on bacterial DNA. Then they were used to screen the porcine
BAC library of Rogel-Gaillard et al. (1999) by PCR. The obtained BAC
clones were verified by sequencing PCR products from BAC DNA with the
respective primers. BAC DNA for the four genes was used to map them on
the porcine genome by FISH, performed as described earlier (Yerle et al.,
1994). Briefly, 100 ng of total BAC DNA was labeled by incorporation of
biotinylated 16-dUTP (Boehringer Mannheim) using random priming. To
suppress background caused by repetitive sequences, the labeled probes,
sonicated pig genomic competitor DNA (100-fold in excess), and salmon sperm
DNA (1000-fold excess) were ethanol precipitated together and then
redissolved in 30 µl of hybridisation medium. A prehybridisation was performed
for 3 h at 37 °C, then hybridisation was carried out for 48 h at 37 °C. Standard
protocols were followed for post-hybridisation washings and detection of
fluorescent signals.

RH mapping 

Primers for BAIAP1, GATA2, IL5RA, LMCD1, MME, RYK and
SGU003 were first shown not to amplify the same fragment on hamster
DNA. Then they were used to screen the 90 hybrids of the INRA-University
of Minnesota porcine Radiation Hybrid (IMpRH) panel of Yerle et al. (1998)
by PCR. A radiation hybrid map was built using the RHMAP3.0 statistical
package (Lange et al., 1995). Analyses were performed under the equal
retention probability model. Using the RH2PT program, two-point distances were
calculated between all markers. Linkage groups were defined using a lod
score threshold of 4.8. Multipoint analyses were then performed using
RHMAXLIK. When the marker density was sufficient in each linkage group,
a framework map was built with a threshold likelihood ratio >1000:1
(minimum log10 likelihood difference of 3). For linkage groups where it was not
possible to build a framework map, two markers within the linkage group
were chosen and were oriented according to the genetic map (Rohrer et al.,
1996) and to the more probable order by calculating the likelihood of the
several orders with RHMAP3.0. When the marker density was not sufficient
to build a whole framework map, locations of the unlinked markers were
defined by taking into account both multipoint RH analysis (best likelihood)
and locations on the genetic and/or cytogenetic map (Milan et al., 2000).
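
The two thresholds used above can be illustrated with a short Python sketch.
It is not the RHMAP3.0 implementation: the function name, marker names and
LOD values are hypothetical. It assumes that a linkage group is any set of
markers connected by pairwise lod scores at or above the threshold
(single-linkage grouping), which the text does not specify.

```python
import math
from typing import Dict, FrozenSet, List, Sequence


def linkage_groups(
    markers: Sequence[str],
    pair_lod: Dict[FrozenSet[str], float],
    threshold: float = 4.8,
) -> List[List[str]]:
    """Group markers connected by two-point lod scores >= threshold."""
    parent = {m: m for m in markers}

    def find(m: str) -> str:
        # Follow parent links to the group representative (path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for pair, lod in pair_lod.items():
        if lod >= threshold:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for m in markers:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    # Hypothetical markers and two-point lod scores.
    lods = {
        frozenset(("M1", "M2")): 11.0,
        frozenset(("M2", "M3")): 5.2,
        frozenset(("M3", "M4")): 3.9,
    }
    print(linkage_groups(["M1", "M2", "M3", "M4"], lods))
    # A likelihood ratio of 1000:1 is a log10 likelihood difference of 3.
    print(math.log10(1000))
```

In this example M4 remains unlinked because its best lod score (3.9) is
below 4.8, which is the situation the text handles by placing such markers
with the help of the genetic and cytogenetic maps.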

Table 2. Summary of loci with new porcine sequences and mapping data

         Acc. no.                                       Location by FISH          Location by IMpRH
         porcine    Percentage coding sequence
Locus    sequence   identity with other sequences       SSC           HSA a       Linked marker (distance + LOD score)

BAIAP1   AF386793   118 bp: 94 % AB010894 (human)       13q24         3p14.1      Swr2114 (21 cR; LOD = 11.03)
                    118 bp: 79 % BI889538 (zebrafish)
GATA2    AY280857   222 bp: 90 % M68891 (human)         -             3q21        Sw2448 (37 cR; LOD = 8.26)
                    222 bp: 98 % BG383421 (pig)
HTR1F    AF386794   197 bp: 92 % AF498981 (human)       13q45         3p12        -
                    197 bp: 98 % AF255663 (pig)
IL5RA    AY280858   100 bp: 94 % NM_000564 (human)      -             3p26-p24    Swr2114 (64 cR; LOD = 3.85) b
                    100 bp: 89 % AB056101 (rat)
LMCD1    AY280859   113 bp: 92 % AF169284 (human)       -             3p26-p24    Sw2450 (61 cR; LOD = 11.03)
                    113 bp: 93 % BF039373 (cow)
MME      AY280860   181 bp: 89 % Y00811 (human)         -             3q25.1-q25.2  CP (50 cR; LOD = 7.75)
                    181 bp: 99 % E01557 (pig)
PTPRG    AF386795   126 bp: 96 % L09247 (human)         13q23         3p14.2      -
                    126 bp: 92 % U38349 (chicken)
RYK      AY280861   102 bp: 97 % X69970 (human)         -             3q22        Swc22 (12 cR; LOD = 18.62)
                    102 bp: 100 % CA779972 (pig)
SGU003   AY280862   -                                   13q41-q42 c   3q13.3 d    Sw1878 (6 cR; LOD = 16.99)
UBE1C    AF386796   104 bp: 98 % BC022853 (human)       13q24         3p13        -
                    104 bp: 97 % CB224894 (cow)

a Data are from The Genome Database (http://gdbwww.gdb.org/gdb/).
b The RH mapping of IL5RA should be taken with caution as the LOD score is very low.
c SGU003 is located in BAC 505B4. This clone, containing B4GALT4, was mapped by FISH in a previous study (Van Poucke et al., 2001).
d Mapping data for B4GALT4.


Results 

Characterisation of the PCR products 
Ten new primer pairs, based on sequences of loci localised 
on HSA3, were developed to amplify a PCR product on porcine 
genomic DNA (Table 1). The accession numbers of the porcine 
sequences are shown in Table 2, together with the percentages 
of coding sequence identity with other sequences. All intron 
sequences are flanked by the consensus GT...AG splice sites. 
The SGU003 primers amplify a microsatellite located in porcine
BAC clone 505B4, which also contains B4GALT4. The
exact positions of SGU003 and B4GALT4 in the BAC clone 
were not determined. 

BAC library screening and FISH mapping 
The obtained BAC clones (509H4 for BAIAP1, 913D12 for
HTR1F, 492A3 for PTPRG and 293F4 for UBE1C), containing
at least a part of the respective genes, were mapped by
FISH. For each BAC clone, twenty-five metaphases were
analyzed after hybridisation. All the BAC clones hybridised to a
specific chromosomal region and the signals appeared as bright
double fluorescent spots on each chromatid of the two
homologous chromosomes. All loci were assigned to SSC13 and no
background was observed. The partial porcine metaphase
spreads showing the distinct FITC hybridisation signals are
shown in Fig. 1. Measurements of the position of the
fluorescent spots on SSC13 allowed us to determine the precise
localisation of each BAC (Table 2). The locations of the human
orthologs on HSA3 are also listed in Table 2 and drawn as an
integrated comparative map in Fig. 2. The genes described in
previous studies have also been included here for comparative
purposes.


RH mapping 

Two-point distances and lod scores between the seven new 
loci and their closest marker on the first generation RH map are 
presented in Table 2. RYK was placed on the framework map, 
whereas the other six loci were added to the map at their most 
likely position, represented by a line at the right of the map. 
This map is drawn in Fig. 2 as the middle part of the integrated 
comparative map. 

Localisation of microsatellites S0219 and S0076 

Primer pairs amplifying microsatellites S0219 (Robic et al.,
1994) and S0076 (Wintero et al., 1994) were used to screen the
porcine BAC library by PCR. BAC end sequences from one
clone for each microsatellite, BAC 842D4 for S0219 and BAC
1031H1 for S0076, were determined by sequencing directly on
purified BAC DNA using the standard vector U and R primers.
The obtained sequences were compared with the databases
using the NIX program and submitted to the GenBank
database. The 504-bp sequence 842D4-U (Acc. no. AY282782)
contained a PRE1_SS repeat (SINE/PIG family) and was not
useful for comparative purposes with the human genome draft
sequences since no other significant sequence identities were
found. The complete 780-bp 842D4-R sequence (Acc. no.
AY282783) showed 76 % sequence identity with the HSA3
clone RP11-72112 (Acc. no. AC137672). This clone is part of
the HSA3 genomic contig NT_028123 and is located on the
UCSC Genome Browser on Human (April 2003;
http://genome.ucsc.edu/) at base position 18.000.000, between the
known genes TBC1D5 and KCNH6 (both located on
HSA3p24.3 according to the LocusLink database). Despite the
high percentage of sequence identity with human, no significant
sequence identities were found with known genes, mRNAs
or ESTs. The first 221 bp of the 529-bp 1031H1-F sequence
(Acc. no. AY282784) showed 68 % sequence identity with the
HSA3 clone RP11-103N21 (Acc. no. AC099048). This clone is
part of the HSA3 genomic contig NT_022517 and is located on
the UCSC Genome Browser on Human (April 2003) at base
position 30.000.000. The clone contains the 3' side of RBMS3,
a gene located on HSA3p24->p23 according to the LocusLink
database. Our BAC-end sequence showed no significant
sequence identities with known genes, mRNAs or ESTs. The
471-bp 1031H1-R sequence (Acc. no. AY282785) gave no database
hits. The location of the HSA3 genomic contigs that are
orthologous with the porcine BAC clones containing microsatellite
markers S0219 and S0076 is shown at the right side of Fig. 2.

Fig. 1. Partial porcine metaphases showing distinct FITC hybridization
signals from (A) BAIAP1, (B) HTR1F, (C) PTPRG and (D) UBE1C.

Fig. 2. Integrated comparative map between SSC13 and HSA3 with proposed
conserved syntenic groups. Locus designations in red were mapped in this
study. The mapping of the other porcine loci has been described in Rohrer
et al. (1996), Hawken et al. (1999), Pinton et al. (2000) and Van Poucke et
al. (2001). Mapping data of the human loci is from The Genome Database
(http://gdbwww.gdb.org/gdb/) and the genes were ordered according to their
position in the UCSC Genome Browser on Human (April 2003;
http://genome.ucsc.edu/).

Discussion

Differences between the porcine sequences of GATA2,
HTR1F and MME obtained in this study, and other porcine
sequences from the database are either SNPs or sequencing
errors. The 222-bp coding sequence of GATA2 showed three
differences with the porcine EST BG383421. The A495->G and
G579->A differences cause no amino acid changes and the
deletion of G532 of our sequence in the porcine EST was probably a
sequencing error in the porcine EST since our sequence
encodes the GATA2 protein. The 197-bp coding sequence of
HTR1F showed two differences with the porcine mRNA
sequence AF255663. The A72->C difference causes an N->H
amino acid change, whereas the C176->T difference causes no
amino acid change. The 181-bp coding sequence of MME
showed only one difference with the porcine cDNA sequence
E01557. The A22->G difference causes no amino acid change.

The localisation of BAIAP1, GATA2, HTR1F, IL5RA,
LMCD1, MME, PTPRG, RYK, SGU003 and UBE1C (all
mapped to HSA3) to SSC13, and the identification of
homologous regions on HSA3 by sequence comparison with porcine
BAC end sequences, not only confirms the extended
conservation of synteny between HSA3 and SSC13, but also defines
more precisely the regions with conserved linkage. We have
more evidence that the complete SSC13 consists of only five
blocks showing conserved linkage with HSA3 and HSA21
(Fig. 2). Currently available mapping data (mostly FISH) suggest
that the gene order in the blocks is conserved between pig and
human. However, high-resolution RH mapping and eventually
sequencing should be performed to reveal the exact gene
order.

The first block, a part of SSC13q31, is homologous with 
HSA3p26->p25. This region is now defined by seven genes, 
from IL5RA to XPC. The mapping of IL5RA on the RH map 
indicates that the genes of this block are probably located in the 
same orientation. 

Block 2 now comprises the telomeric part of SSC13 above 
block 1 (SSC13q11->q24). It shows conserved linkage with 
HSA3p24->p12 in the same orientation. The upper part of this 
block was defined by sequence comparison of 842D4 and 
1031H1 BAC end sequences. Since RARB and TOP2B were 
located next to each other according to the Human Genome 
Browser and since both genes were located between KCNH 6 
and RBMS3 (the genes closest to the orthologous 842D4 and 
1031H1 BAC sequences, both belonging to block 2) in the same 
genomic contig NT_022517, we can conclude that RARB is not 
the lower gene of block 1, as described earlier, but a gene of 
block 2. This contradicts our previous mapping of RARB to 
SSC13q31 (Van Poucke et al., 2001), but is in favour of the 
mapping to SSC13q11->q14 by Sun et al. (1999). The chromosome 
breakpoint between block 1 and 2 in human should thus 
be located between XPC and KCNH 6, probably close to 
TBC1D5. The lower part of block 2 is defined by MITF. All 19 
genes of block 2 could be placed in the same order according to 
their position in the Human Genome Browser. High-resolution 
mapping with the RH panel will reveal if this is the exact gene 
order or if more intrachromosomal rearrangements occurred 
during mammalian evolution. Although CCK was no longer 
present in the latest version of the Human Genome Browser, 
we included it based on its location in an earlier version and 
based on its cytogenetic location. However, care should be taken 
in using all these mapping data because of discrepancies 
between the different databases. UBE1C was located by FISH 
in this study to SSC13q24, as was expected according to our 
comparative map. However, Lahbib-Mansais et al. (2003) 
mapped the same gene to SSC1q2.7. Further studies will probably 
show that these different locations are caused by pseudogenes 
or new family members. 

Block 3 comprises SSC13q41->q46 and is homologous with 
HSA3p11->q21. It contains 11 genes, but they are probably 
localised in the reverse order if no other intrachromosomal 
rearrangements have occurred. The chromosomal breakpoint 
between block 2 and 3 in human should be located between 
MITF and POU1F1. 

Block 4 is located at SSC13q31->q41 and corresponds to 
HSA3q21->q29. All 24 genes of block 4 could be placed in the 
same order according to their position in the Human Genome 
Browser. Again, to confirm our hypothesis, every single gene of 
this block should be mapped on the RH panel. Although 
PBXP1 and APOD were no longer present in the latest version 
of the Human Genome Browser, we included them based on 
their location in an earlier version and based on their cytogenetic 
location. The chromosomal breakpoint between block 3 
and 4 in human should be located between ZNF148 and 
GATA2. 

Block 5 comprises the conserved synteny between 
SSC13q47->q49 and HSA21 as described by Tuggle et al. 
(2001). 
Abstract. Genes located on human chromosome 12 
(HSA12) are conserved on pig chromosomes 5 and 14 (SSC5 
and SSC14), with HSA12q23.3->q24.11 harboring the evolutionary 
breakpoint between these chromosomes. For this study, 
pig sequence-tagged sites (STS) were developed for nine 
HSA12 genes flanking this breakpoint. Radiation hybrid (RH) 
mapping using the IMpRH panel revealed that COL2A1, 
DUSP6, KITLG, PAH and STAB2 map to SSC5, while PXN, 
PLA2G1B, SART3 and TCF1 map to SSC14. Polymorphisms 
identified in COL2A1, DUSP6, PAH, PLA2G1B and TCF1 
were used for genetic linkage mapping and confirmed the map 
locations for these genes. Our results indicate that the HSA12 
evolutionary breakpoint occurs between STAB2 and SART3 in 
a region spanning less than five million basepairs. These results 
refine the comparative map of the HSA12 evolutionary breakpoint 
region and help to further elucidate the extensive gene 
order rearrangements between HSA12 and SSC5 and 14. 

Copyright © 2003 S. Karger AG, Basel 


Comparative mapping utilizes information from species 
such as human and mouse for which dense maps or complete 
genome sequences are available. This information contributes 
to construction of high-resolution genome maps in other species 
such as the pig, and these maps aid in identification of candidate 
genes for economically important traits. Furthermore, 
elucidation of breakpoint positions and gene order rearrangements 
between species provides information regarding chromosomal 
evolution. Chromosomal segments with conserved 
synteny between human and pig have been broadly defined 
using bidirectional chromosomal painting (Goureau et al., 
1996). Conservation of synteny was observed between human 
chromosome (HSA) 12 and pig chromosomes (SSC) 5 and 14. 
However, chromosomal painting does not provide information 
about gene order and thus cannot precisely resolve the location 
of the HSA12 evolutionary breakpoint. 

Several HSA12 genes have been mapped in the pig (Archibald 
et al., 1995; Rohrer et al., 1996, 1997; Hawken et al., 1999; 
Lahbib-Mansais et al., 2000, 2003; Pinton et al., 2000; Goureau 
et al., 2001). These efforts have largely involved either framework 
map construction or elucidation of gene order rearrangements 
observed on SSC5, and none have directly considered 
the position of the evolutionary breakpoint on HSA12. Mapping 
of HSA12 genes to SSC5 has revealed extensive gene order 
rearrangements in this conserved region (Rohrer et al., 1997; 
Goureau et al., 2001; Lahbib-Mansais et al., 2003). Thus, mapping 
of additional markers is needed to further resolve gene 
order on SSC5 and precisely define the HSA12 breakpoint. The 
objective of this study was to further develop the comparative 
map of this region. 
Table 1. Pig sequence tagged sites developed for human chromosome 12 genes 

Gene (a) | Gene name | Primer sequences (5'-3') | Product size (bp) | PCR conditions (TA/MgCl2/Primer) (b) | Polymorphism (c) 
COL2A1 | collagen type II alpha 1 | F: TGGTGGAGCAGCAAGAGC (d); R: CCTTCTTGAGGTTGCCAGC (d) | 547 | 62/2.0/0.5 | RsaI RFLP 
DUSP6 | dual specificity phosphatase 6 | F: TGCAGGTCTCTACGTTCCAA (e); R: GCCAAGCAATGCACCAAG (e) | 978 | 55/1.5/0.5 | MspI RFLP 
KITLG | KIT ligand | F: ATCCATTGATGCCTTCAAGG (d); R: CTGTCATTCCTAAGGGAGCTG (d) | ~1100 | 62/2.0/0.5 | 
 | | F: AGCTGGCTGCAACAG (e); R: CCTCATTGTCCCTATAACAC (e) | 353 | 61/1.5/0.5 | 
PXN | paxillin | F: TCTTCTCTCCACGCTGCTACTACT (e); R: CAGCAAGGCTCTCTTCTTCCCACC (e) | 520 | 55/1.5/0.5 | 
PAH | phenylalanine hydroxylase | F: CTGTGGAGTTTGGGCTTTGC (d); R: AGTCCTCACTTTCTCCTTGGCA (d) | ~1000 | 56/2.0/0.5 | 
 | | F: TTTCTCCTTGGCATCACTG (e); R: GGGGCAAATTACTCCTTTCT (e) | 342 | 55/1.5/0.5 | SSCP 
PLA2G1B | phospholipase A2, group IB | F: GACTACGGCTGCTACTGTG (f); R: TTACAGCTGGCCAGTTTCTT (f) | 625 | 59/1.5/2.0 | DpnII RFLP 
SART3 | squamous cell carcinoma antigen recognized by T cells 3 | F: GTGTGGATAAGAGCAAAAACCC (e); R: GTGACCAGCCTGAGGTCCT (e) | 871 | 59/1.5/0.5 | 
STAB2 | stabilin 2 | F: TCAGAGAATTAAATACTGAACCCAGAG (e); R: CTCAGTGGCCAGGCATAAG (e) | ~1650 | 57/1.5/0.5 | 
TCF1 | transcription factor 1, hepatic | F: CCCAGCAGATCCTGTTCC (d); R: CTCCGTGACAAGGTTGGAG (d) | 602 | 58/1.5/0.5 | SNP/SSCP (g) 

(a) GenBank accession numbers BV012665-BV012674. 
(b) TA, PCR annealing temperature in °C; MgCl2, concentration in mM; Primer, concentration in µM. 
(c) RFLP, Restriction fragment length polymorphism; SSCP, Single-stranded conformational polymorphism; SNP, Single nucleotide polymorphism. 
(d) Primer source CATS project (Lyons et al., 1997). 
(e) Primer designed for this study from pig or human sequence. 
(f) Primer source UM-STS project (Venta et al., 1996). 
(g) C/A transversion SNP identified in TCF1 intron sequence using a pool-and-sequence method. SSCP analysis used for genotyping. 


Materials and methods 

Amplification of HSA12 genes 

HSA12 genes encoding collagen type II alpha 1 (COL2A1), dual specificity 
phosphatase 6 (DUSP6), KIT ligand (KITLG), paxillin (PXN), phenylalanine 
hydroxylase (PAH), phospholipase A2, group IB (PLA2G1B), squamous 
cell carcinoma antigen recognized by T cells 3 (SART3), stabilin 2 
(STAB2) and hepatic transcription factor 1 (TCF1) were amplified using 
heterologous primers (Table 1). The PCR was performed using 25 ng genomic 
DNA in 10-µl reactions containing 1x PCR buffer (Promega, Madison, WI), 
1.5 or 2.0 mM MgCl2, 200 µM each dNTP, 0.5 or 2.0 µM each primer and 
0.5 U Taq DNA polymerase (Promega, Madison, WI). The PCR profiles 
included an initial denaturation of 3 min at 94 °C followed by 30 cycles of 
94 °C for 1 min, 55-62 °C for 1 min, 72 °C for 1 min and a final extension of 
72 °C for 10 min. PCR products were visualized on 1% agarose gels containing 
0.4 µg/ml ethidium bromide, fragments were sequenced to confirm pig 
STS identities and sequences were submitted to the GenBank dbSTS database 
(accession numbers BV012665-BV012674). 

Radiation hybrid and cytogenetic mapping 

RH mapping was performed using the INRA-University of Minnesota 
7,000-rad radiation hybrid panel (IMpRH; Yerle et al., 1998; Hawken et al., 
1999). COL2A1, KITLG, PAH and TCF1 were also physically mapped using 
a pig-rodent somatic cell hybrid panel (SCHP; Robic et al., 1996; Yerle et al., 
1996). Large size differences allowed pig and rodent products for PAH, 
SART3 and TCF1 to be distinguished. Pig COL2A1 products were distinguished 
from rodent by digestion with DpnII, which cleaved only rodent. Pig 
STAB2 products were distinguished by digestion with HindIII, which 
cleaved only pig. PCR conditions were as previously described except that 
12.5 ng of hybrid DNA was used. SCHP PCR products were visualized on 
1% agarose gels, and regional assignments were determined as described in 
Chevalet et al. (1997) by submitting data to the INRA SCHP database (http:// 
www.toulouse.inra.fr/lgc/pig/hybrid.htm). The IMpRH panel was screened 
twice for each marker and products were visualized on 3% agarose gels. Each 
hybrid was scored as positive, negative or ambiguous, and two-point analysis 
of RH data was performed using the IMpRH server mapping tool as outlined 
by Milan et al. (2000; http://imprh.toulouse.inra.fr/). Multipoint analysis 
with maximum likelihood criteria was subsequently performed to integrate 
the new markers with the map of Hawken et al. (1999) using the RHMAP v. 
3.0 package (Boehnke et al., 1991; Lange et al., 1995; http://www.sph. 
umich.edu/statgen/boehnke/rhmap.html). RH maps were drawn using the 
MapCreator software (http://www.wesbarris.com/mapcreator/) and aligned 
next to chromosome ideograms based on the standard porcine karyotype as 
described by Gustavsson (1988). 
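
The two-point step can be illustrated with a minimal sketch. It assumes the standard equal-retention model for radiation hybrids (retention probability r, breakage probability theta) and a simple grid search; it is not the algorithm of the IMpRH server or of RHMAP, and the function name and the scoring convention (1, 0, None) are hypothetical.

```python
import math

def two_point_rh(a, b, steps=1000):
    """Two-point analysis of two markers typed on the same hybrids.

    a, b: per-hybrid scores, 1 = positive, 0 = negative, None = ambiguous.
    Returns (theta_hat, distance_in_centiRays, lod).
    """
    # Hybrids scored ambiguous for either marker are left out.
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no hybrids scored for both markers")
    n11 = sum(1 for x, y in pairs if x == 1 and y == 1)
    n00 = sum(1 for x, y in pairs if x == 0 and y == 0)
    nd = n - n11 - n00                      # discordant hybrids (+- or -+)
    r = (2 * n11 + nd) / (2 * n)            # pooled retention frequency
    if not 0 < r < 1:
        raise ValueError("retention frequency must lie strictly between 0 and 1")

    def loglik(theta):
        p11 = r * (1 - theta + theta * r)   # both markers retained
        p00 = (1 - r) * (1 - theta * r)     # both markers lost
        pd = theta * r * (1 - r)            # each discordant class
        return n11 * math.log(p11) + n00 * math.log(p00) + nd * math.log(pd)

    # Grid search over theta in (0, 1]; theta = 1 means no linkage.
    grid = [k / steps for k in range(1, steps + 1)]
    theta_hat = max(grid, key=loglik)
    lod = (loglik(theta_hat) - loglik(1.0)) / math.log(10)
    distance = math.inf if theta_hat == 1.0 else -100 * math.log(1 - theta_hat)
    return theta_hat, distance, lod
```

Two markers with identical retention patterns give a theta estimate near zero and a positive LOD, whereas independent patterns give theta = 1 and a LOD of zero.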

Polymorphism identification and genetic linkage mapping 

A DNA pooling strategy was used to identify potential RFLP in each STS 
(Sun et al., 1998). Equal quantities of DNA from 22 PiGMaP F0 individuals 
were used to create a grandparent pool (GPP). Each STS was amplified from 
the GPP and products were digested with a panel of restriction endonucleases. 
Digested fragments were electrophoresed on 2% agarose gels, and 
those showing a banding pattern consistent with the presence of two possible 
alleles were further evaluated using PiGMaP individuals. STS not containing 
an observable RFLP were evaluated by SSCP analysis using a Bio-Rad 
DCode Mutation Detection System (Bio-Rad Laboratories, Hercules, CA). 
PCR products were denatured and electrophoresed on 8% native polyacrylamide 
gels with 5% glycerol at 4 °C. Gels were run at a constant power for 
6-12 h and stained in a 1-µg/ml ethidium bromide solution. Following 
optimization of conditions, SSCP alleles were used to genotype individuals 
from informative PiGMaP families. 
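
The logic of an RFLP screen can be sketched in a few lines: an allele that carries the recognition site is cut into smaller fragments, while an allele lacking it stays intact, so a pooled digest shows both patterns. The sequences below are invented for illustration; only the MspI recognition site (C^CGG) is a known fact, and the function name is hypothetical.

```python
def digest_fragment_lengths(seq, site, cut_offset):
    """Fragment lengths after cutting `seq` at every occurrence of `site`.

    cut_offset is the number of bases of the recognition site that lie
    to the left of the cut (1 for MspI, C^CGG). Only one strand is
    searched, which is sufficient for palindromic sites.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that arise when a cut falls at a sequence end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

# Hypothetical 14-bp alleles differing by one base inside the MspI site.
allele_cut = "AAAACCGGTTTTTT"
allele_uncut = "AAAACTGGTTTTTT"
print(digest_fragment_lengths(allele_cut, "CCGG", 1))
print(digest_fragment_lengths(allele_uncut, "CCGG", 1))
```

A pool containing both alleles would show the uncut band together with the two smaller bands, which is the two-allele banding pattern the screen looks for.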

The TCF1 STS was used to evaluate a pool-and-sequence strategy for 
identifying single nucleotide polymorphisms (SNPs). TCF1 was amplified 
Table 2. Summary of cytogenetic, radiation hybrid (RH) and genetic linkage map locations of HSA12 genes in the pig 

Gene (a) | Human: cytogenetic | Human: genome sequence (bp from pter) (b) | Human: GB4 position (cR3000) (c) | Pig: cytogenetic | Pig: RH map (marker, LOD) (d) | Pig: genetic linkage map (marker, LOD) (d) 
COL2A1 | 12q13.11 | 48,083,498 | 211.47 | 5q12-1/2q21 or 5q25 (e) | 5 (SW986, 13.29) | 5 (IGF1, 6.27) 
KITLG | 12q21.32 | 88,823,545 | 346.64 | 5q25 (e) | 5 (DUSP6, 16.87) | n.d. (f) 
DUSP6 | 12q21.33 | 89,675,094 | 357.50 | n.d. | 5 (KITLG, 16.87) | 5 (SW1954, 5.12) 
PAH | 12q23.2 | 103,165,051 | 399.61 | 5q25 (e) | 5 (IGF1, 15.47) | 5 (S0018, 4.58) 
STAB2 | 12q23.3 | 103,914,016 | n.d. | n.d. | 5 (COL2A1, 9.49) | n.d. 
SART3 | 12q23.3 | 108,849,304 | n.d. | n.d. | 14 (SW1527, 16.69) | n.d. 
PXN | 12q24.23 | 120,431,110 | 464.39 | n.d. | 14 (SW1321, 11.81) | n.d. 
PLA2G1B | 12q24.23 | 120,542,772 | ~465.41 (g) | n.d. | 14 (PXN, 9.83) | 14 (SW295, 14.15) 
TCF1 | 12q24.31 | 121,199,402 | 466.91 | 14 (h) | 14 (PLA2G1B, 9.79) | 14 (S0058, 10.96) 

(a) COL2A1, collagen type II alpha 1; KITLG, KIT ligand; DUSP6, dual specificity phosphatase 6; PAH, phenylalanine hydroxylase; STAB2, 
stabilin 2; SART3, squamous cell carcinoma antigen recognized by T cells 3; PXN, paxillin; PLA2G1B, phospholipase A2, group IB; TCF1, 
transcription factor 1, hepatic. 
(b) Position of beginning of STS in the genomic sequence of human chromosome 12 based on the April 2003 assembly using the University of 
California Santa Cruz Genome Browser (http://www.genome.ucsc.edu). 
(c) Position of STS on GeneMap'99 as determined by the GeneBridge 4 (GB4) RH panel (http://www.ncbi.nlm.nih.gov/genemap99/). 
(d) Pig chromosome number. Marker with most significant linkage and corresponding LOD score shown in parentheses. 
(e) Regional assignment using a somatic cell hybrid panel (Robic et al., 1996; Yerle et al., 1996). Risk of error less than 0.1%. 
(f) Not determined. 
(g) Position of an STS ~5 kb upstream of PLA2G1B on clone RP11-144B2 (Accession # AC073930). 
(h) Location using a somatic cell hybrid panel (Yerle et al., 1996; Robic et al., 1996). Marker not regionally assigned due to limited 
informativeness of the panel for SSC14. Risk of error less than 0.5%. 


from the GPP and the resulting product was sequenced. An SNP was 
detected by examining this sequence for ambiguous base calls resulting from 
the presence of two different bases at the same position. The TCF1 fragment 
was digested with DpnII (10,000 U/ml, New England Biolabs, Beverly, MA) 
to obtain smaller fragments, and SSCP analysis was used for subsequent 
genotyping of PiGMaP individuals. 
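
The pool-and-sequence idea can be sketched as a scan of a base-called consensus for IUPAC two-base ambiguity codes. This is only an illustration: the read below is invented, the function name is hypothetical, and in practice ambiguous calls would be confirmed on the chromatogram rather than taken from text alone.

```python
# IUPAC codes that stand for exactly two bases at one position.
IUPAC_TWO_BASE = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}

def candidate_snps(consensus):
    """Return (1-based position, code, bases) for each two-base ambiguity."""
    return [
        (i + 1, code, IUPAC_TWO_BASE[code])
        for i, code in enumerate(consensus.upper())
        if code in IUPAC_TWO_BASE
    ]

# Invented pooled read; "M" marks a position where both A and C were called,
# as would be expected for a C/A transversion.
print(candidate_snps("ACGTMACG"))
```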

For STSs with an identified polymorphism, linkage analysis was performed 
using CRIMAP 2.4 (Green et al., 1990). Two-point linkages were 
computed between new markers and previously reported markers (Archibald 
et al., 1995; http://www.thearkdb.org/). Loci were inserted and order established 
by multipoint analysis using the ALL and FLIPS options, and the 
CHROMPIC option was used to identify potential genotyping errors. New 
maps were constructed using the BUILD option. 
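
For readers unfamiliar with LOD scores, the following sketch shows the simplest textbook case: phase-known meioses counted directly as recombinant or non-recombinant. It is an assumption-laden illustration, not CRIMAP's pedigree likelihood, and the function name and counts are hypothetical.

```python
import math

def two_point_lod(recombinants, meioses):
    """Maximum-likelihood recombination fraction and LOD score.

    Assumes fully informative, phase-known meioses. The LOD compares the
    likelihood at the estimated theta with the likelihood at theta = 0.5
    (no linkage).
    """
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    theta = min(recombinants / meioses, 0.5)   # theta is capped at 0.5
    nonrec = meioses - recombinants
    log_l = 0.0
    if recombinants:
        log_l += recombinants * math.log10(theta)
    if nonrec:
        log_l += nonrec * math.log10(1 - theta)
    lod = log_l - meioses * math.log10(0.5)
    return theta, lod

theta, lod = two_point_lod(2, 20)   # hypothetical counts
print(theta, round(lod, 2))
```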


Results and discussion 

Nine genes flanking the human-pig evolutionary breakpoint 
on HSA12 were mapped to SSC5 and SSC14 (Table 2 and 
Fig. 1). Using the IMpRH panel, COL2A1, DUSP6, KITLG, 
PAH and STAB2 were mapped to SSC5, and PXN, PLA2G1B, 
SART3 and TCF1 were mapped to SSC14. Genetic linkage 
mapping confirmed the locations of COL2A1, DUSP6, PAH, 
PLA2G1B and TCF1. In addition, COL2A1, KITLG, PAH 
and TCF1 were physically mapped using an SCHP. However, 
this SCHP has a low retention of chromosomal fragments for 
both SSC5q and SSC14 (Lahbib-Mansais et al., 2000, 2003; 
Goureau et al., 2001) so the SCHP results should be interpreted 
with caution. 

Five genes mapped in this study are located within the interval 
HSA12q23.2->q24.23, which harbors the breakpoint 
(Fig. 1). Three other genes in this interval have previously been 
mapped in the pig. Insulin-like growth factor-1 (IGF1) and 
tumor rejection antigen 1 (TRA1) both map to SSC5 (Archibald 
et al., 1995; Hawken et al., 1999; Lahbib-Mansais et al., 
2000), while D-amino acid oxidase (DAO) maps to SSC14 (Mellink 
et al., 1993; Lahbib-Mansais et al., 2000). Our results suggest 
that STAB2 and SART3 flank the HSA12 breakpoint and 
the distance between these genes in the human genome is predicted 
to be less than 5 Mbp (UCSC April 2003 assembly and 
NCBI Build 33; Table 2). TRA1 is located less than 350 kb distal 
to STAB2 on HSA12 and these genes are closely linked on 
the SSC5 RH map. Thus, our results taken together with those 
of Lahbib-Mansais et al. (2000) suggest that the human-pig evolutionary 
breakpoint on HSA12 is located between TRA1 and 
SART3 in a region predicted to span less than 4.6 Mbp. 
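
The breakpoint interval follows directly from the Table 2 values. The sketch below, with hypothetical names, sorts the nine genes by their HSA12 position and reports every adjacent pair that maps to different pig chromosomes.

```python
# (gene, HSA12 position in bp from pter, pig chromosome), from Table 2.
TABLE2 = [
    ("COL2A1", 48_083_498, 5),
    ("KITLG", 88_823_545, 5),
    ("DUSP6", 89_675_094, 5),
    ("PAH", 103_165_051, 5),
    ("STAB2", 103_914_016, 5),
    ("SART3", 108_849_304, 14),
    ("PXN", 120_431_110, 14),
    ("PLA2G1B", 120_542_772, 14),
    ("TCF1", 121_199_402, 14),
]

def synteny_breaks(genes):
    """Adjacent human genes on different pig chromosomes, with their spacing."""
    ordered = sorted(genes, key=lambda g: g[1])
    return [
        (a[0], b[0], b[1] - a[1])
        for a, b in zip(ordered, ordered[1:])
        if a[2] != b[2]
    ]

for left, right, span in synteny_breaks(TABLE2):
    print(f"{left} - {right}: {span:,} bp")
```

With these positions the only switch from SSC5 to SSC14 lies between STAB2 and SART3, which are 4,935,288 bp apart, in line with the stated bound of less than 5 Mbp.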

To assess gene order rearrangements between human and 
pig, human gene order and distances between genes were determined 
by searching current human genome map databases (Table 2). 
Results of this study confirm and extend previous observations 
that, despite extensive conservation of synteny between 
HSA12 and SSC5, considerable intrachromosomal rearrangements 
have occurred during mammalian evolution (Johansson 
et al., 1995; Rohrer et al., 1997; Lahbib-Mansais et al., 2000, 
2003; Goureau et al., 2001). Lahbib-Mansais et al. (2003) identified 
five evolutionary breakpoints between HSA12 and SSC5. 
Two conserved blocks defined by these breakpoints contained a 
single gene and our results have added an additional gene to 
each block. In addition, we have added another block containing 
DUSP6 and KITLG to the distal end of SSC5q. Our results 
suggest that the most likely order for these genes is inverted 
between human and pig, but this result should be taken with 
caution because the two genes are very closely linked on the pig 
RH map and the order KITLG-DUSP6 is also possible. Mapping 
of additional markers is necessary to confirm gene order 
within this block. Taken together, our results and those of 
Lahbib-Mansais et al. (2003) predict six evolutionary breakpoints 
between HSA12 and SSC5 (Fig. 2). 

Fig. 1. Diagram illustrating the cytogenetic, RH and genetic linkage maps of SSC5 and SSC14 and the comparative map 
location of genes on HSA12. Genes mapped in this study are displayed in bold face type. Previously reported markers were used for 
RH map analysis, linkage analysis and map construction (Archibald et al., 1995; Rohrer et al., 1996; Yerle et al., 1997; Hawken et 
al., 1999). RH map position is in centi-Rays, the sex-averaged linkage map position is in centi-Morgans and chromosome ideograms 
were drawn to represent the standard porcine karyotype described by Gustavsson (1988). Human cytogenetic locations were 
obtained from the April 2003 University of California Santa Cruz genome assembly (http://www.genome.ucsc.edu). 

Very few known HSA12 genes have previously been mapped 
to SSC14, although mapping of ESTs to this region (Rink et 
al., 2002) indicates that complex rearrangements have occurred 
between the human and pig genomes. Lahbib-Mansais et al. 
(2000) placed DAO on the SSC14 RH map and our results add 
four additional HSA12 genes to this region. Our revised SSC14 
RH map agrees with the improved SSC14 RH map proposed 
by Lahbib-Mansais et al. (2000) and addition of new genes to 
this map further improves the human-pig comparative map. 
Our results predict that these five genes are organized into two 
blocks with a breakpoint that inverts the block order between 
human and pig (Fig. 2). 


We report here the mapping of five HSA12 genes to SSC5 
and four HSA12 genes to SSC14, which narrows the HSA12 
evolutionary breakpoint region to less than 5 Mbp. This work 
improves the human-pig comparative maps for these chromosomes 
and further elucidates the complex gene order rearrangements 
within blocks of conserved synteny observed between 
HSA12 and SSC5. This study also identifies a gene order 
rearrangement between HSA12 and SSC14. Identification of 
polymorphisms in five genes and addition of these markers to the 
genetic map also contribute to further integration of the 
cytogenetic, RH and genetic maps for SSC5 and SSC14. In addition, 
these polymorphisms contribute to the developing SNP 
collection for pigs. Several significant or suggestive quantitative 
trait loci (QTL) have recently been reported on SSC5 or 
SSC14 that encompass this breakpoint region (Cassady et al., 
2001; Malek et al., 2001a, 2001b; Desautes et al., 2002; Milan 
et al., 2002), underscoring the importance of developing an 
accurate, high-resolution comparative map of this region to 
facilitate identification of positional candidate genes responsible 
for the genetic variation observed in economically important 
traits. 
Fig. 2. Diagram representing gene order rearrangements between HSA12 and SSC5 and 14. Genes 
mapped in this study are shown in bold face type. Other genes have been reported previously (Hawken 
et al., 1999; Lahbib-Mansais et al., 2000, 2003). Human gene order was obtained from the April 2003 
University of California Santa Cruz genome assembly (http://www.genome.ucsc.edu). The position of 
the HSA12 breakpoint leading to conserved segments on SSC5 and SSC14 is shown as a solid line. 
Internal breakpoints predicted by this study or by Lahbib-Mansais et al. (2003) to lead to rearrangements 
in gene order within these conserved segments are shown with broken lines (- - -). 
Abstract. Several genes (PRKAA2, PRKAB1, PRKAB2, 
PRKAG3, GAA, GYS1, PYGM, ALDOA, GPI, LDHA, 
PGAM2 and PKM2), chosen according to their role in the regulation 
of the energy balance and in the glycogen metabolism 
and glycolysis of the skeletal muscle, were studied. Eleven single 
nucleotide polymorphisms (SNPs) were identified in six of 
these genes (PRKAB1, GAA, PYGM, LDHA, PGAM2 and 
PKM2). Allele frequencies were analyzed in seven different pig 
breeds for these loci and for a polymorphism already described 
for GPI and for three polymorphic sites already reported at the 
PRKAG3 locus (T30N, G52S and I199V). Linkage mapping 
assigned PYGM and LDHA to porcine chromosome (SSC) 2, 
PKM2 to SSC7, GAA to SSC12, PRKAB1 to SSC14 and 
PGAM2 to SSC18. Physical mapping, obtained by somatic cell 
hybrid panel analysis, confirmed the linkage assignments of 
PRKAB1 and GAA and localized ALDOA, PRKAB2 and 
GYS1 to SSC3, SSC4 and SSC6, respectively. Pigs selected for 
the association study, for which several meat quality traits were 
measured, were first genotyped at the PRKAG3 R200Q polymorphic 
site (RN locus), in order to exclude carriers of the 
200Q allele, and then were genotyped for all the mutations considered 
in this work. Significant associations (P < 0.001) were 
observed for the PRKAG3 T30N and G52S polymorphic sites 
with meat colour (L* at 24 h post mortem). PGAM2 and PKM2 
were significantly associated (P = 0.01) with drip loss percentage 
and glycogen content at one hour post mortem, respectively. 

Copyright © 2003 S. Karger AG, Basel 


Conversion of muscle to meat is determined by several biochemical 
transformations that take place after the slaughtering 
of the animals. These processes depend in part on the energy 
supply of the muscle, which can be expressed by the glycolytic 
potential (GP) measure. GP is a biochemical parameter that 
defines the quantity of carbohydrate in skeletal muscle susceptible 
to conversion into lactate during the post mortem phase and is 
calculated as the sum of: 2[glycogen + glucose + glucose-6-phosphate] 
+ [lactate] (Monin and Sellier, 1985). Porcine muscles 
with high levels of this parameter usually show a lower ultimate 
pH that, in turn, affects other quality traits such as meat colour, 
water holding capacity, drip loss, tenderness and processing 
yield (e.g. Enfalt et al., 1997; Nanni Costa et al., 2000). 
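
As a worked illustration of the GP formula, the sketch below simply applies it to four concentrations. The function name and the example values are hypothetical; all four inputs are assumed to be expressed in the same units (for instance micromoles per gram of fresh muscle), which makes GP a value in lactate equivalents.

```python
def glycolytic_potential(glycogen, glucose, glucose_6_phosphate, lactate):
    """GP = 2 * (glycogen + glucose + glucose-6-phosphate) + lactate.

    The factor 2 reflects that each glucose unit can yield two lactate
    molecules; all inputs must share the same concentration units.
    """
    return 2 * (glycogen + glucose + glucose_6_phosphate) + lactate

# Invented concentrations, for illustration only.
print(glycolytic_potential(40, 3, 5, 50))
```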

GP has been shown to have a moderate heritability in pigs 
(0.14-0.25; Larzul et al., 1998) that do not belong to the Hampshire 
breed, in which this parameter is largely influenced by the 
PRKAG3 200Q allele (RN- of the Rendement Napole locus; 
Milan et al., 2000) that increases the glycogen content by about 
70% in glycolytic muscles and causes the defect known as acid 
meat. 

Other studies (Andersson-Eklund et al., 1998; Malek et al., 
2001; Paszek et al., 2001; Ovilo et al., 2002) have identified 
several quantitative trait loci (QTL) for muscle GP, glycogen 
content, colour, pH and water holding capacity in pigs, and other 
alleles of the PRKAG3 gene have been shown to influence 
these meat quality parameters (Ciobanu et al., 2001). Therefore, 
it can also be supposed that other genes with small to 
medium effects may be involved in the determination of GP 
and of the technological parameters of the meat that are correlated 
with this measure. As these traits can be affected by alterations 
of glucose metabolism in skeletal muscle, the genes encoding the 
enzymes and regulators involved in the biochemical pathways 
that accumulate or deplete glycogen and glucose in this tissue 
can be considered candidate genes for GP. 

Table 1. List of the investigated genes with amplification and polymorphism analysis protocols 

Function | Gene symbol | Gene name | Acc. no. (a) | PCR primers (5'-3') (b) | Use (c) | Amplified product size (bp) | PCR (d) 
Energy metabolism regulation (AMPK complex) | PRKAA2 | protein kinase, AMP-activated, alpha 2 catalytic subunit | AW428128** (NM_006252, e-161, 94%) | F: CAGCCCTAAAGCACGATGTC; R: TTCAAAATCCAGCTGCTTCA | STS, SCHP analysis | 160 | 63/1.5/R/M 
 | PRKAB1 | protein kinase, AMP-activated, beta 1 non-catalytic subunit | AV601419** (NM_006253, 0.0, 90%) | F: GTGGATGGTCAGTGGACACA; R: TGGGAATCCACCATTAAAGC | STS, SSCP analysis, PCR-RFLP analysis (AluI) | 411/412 | 58/2.0/E/M 
 | | | AJ557222 | F: TGGTTGAGGTCCCTGCATA; R: TGGCTGGGAAGAGACAAGAC | SCHP analysis | 193 | 62/3.0/E/M 
 | PRKAB2 | protein kinase, AMP-activated, beta 2 non-catalytic subunit | AJ557223* | F: TGTGGAAAACGTTTCCTGGT; R: GTGGCCCACTTGTTTAATGG | SCHP analysis; SSCP analysis | 154 | 59/3.0/E/P 
 | PRKAG3 | protein kinase, AMP-activated, gamma 3 non-catalytic subunit | AF214521 | F: GGAGCAAATGTGCAGACAAG; R: CCCACGAAGCTCTGCTTCTT | PCR-RFLP analyses: (MM) (R200Q) (e); (BsaHI) (I199V) (e,f) | 259 | 55/3.0/E/M 
 | | | AF214521 | F: CCACCAGCTCAGAAAGAAGC; R: CCAGAACTGGCCTTGAACTC | PCR-RFLP analyses: (StyI) (T30N) (e,f); (HphI) (G52S) (f) | 124 | 59/3.0/E/M 
 | | | AF214521 | F: TGGAAGAGGGAGTTACTGTGC; R: CAGTCAGAGGGGAGGATGTC | SSCP analysis (L53P) (e) | 180 | 61/2.5/R/M 
Muscle glycogen metabolism | GAA | glucosidase, alpha; acid (Pompe disease, glycogen storage disease type II) | BE235102** (NM_000152, e-123, 85%) | F: AGGACATGGTGGCTGAGTTC; R: CACGTAAGGAGGGTTCTCCA | SCHP analysis, SSCP analysis | 229 | 62/1.5/S/M 
 | GYS1 | glycogen synthase 1 (muscle) | AW786848** (NM_002103, 5e-39, 87%) | F: CTACCCACGGCCAGCATC; R: CCTCCTCATCCTCACTCTGG | STS | 342 | 58/1.5/S/M 
 | | | AJ557224 | F: GCTAGAAGGGTTGGGTAGGG; R: CACTCTGGTGTGGACTCGAA | SCHP analysis, SSCP analysis | 240 | 67/2.5/R/P 
 | PYGM | phosphorylase, glycogen; muscle (McArdle syndrome, glycogen storage disease type V) | Z98787* | F: CCCAAACCTGCCCCTGAGTC; R: CCTTCGTCGCGCTGAACACT | SCHP analysis, SSCP analysis, PCR-RFLP analysis (7^1) | 150 | 60/1.5/R/P 
Muscle glycolysis | ALDOA | aldolase A, fructose-bisphosphate | BF188921** (NM_000034, 0.0, 91%) | F: ATGGGGACCATGACTTGAAA; R: TATTGGGCTTCATCAAGGTG | STS, SCHP analysis, SSCP analysis | 234 | 60/1.5/R or S/M 
 | GPI | glucose phosphate isomerase | Z28398 | F: GTGTCCGGAGTGGCGAATG; R: GGAGGCAATGATGAACAGGGAC (g) | PCR-APLP analysis (g) | 362/382 | 62/2.0/E/M 
 | LDHA | lactate dehydrogenase A | U07178, AJ301275* | F: GCCAGTGTTGCAGATGCT; R: ACATGGCATTGTACACTATTCTG (h) | (3'-UTR) SSCP analysis | 228 | 63/1.5/R/M 
 | | | U07178 | F: CTATAACGTGACTGCAAACTCTAGG; R: TTGCAGTTTGGGCTGTATTTT | (exon 3) SSCP analysis, PCR-RFLP analysis (HpyCH4V) | 150 | 57/3.5/E/M 
 | PKM2 | pyruvate kinase, muscle | AJ301024* | F: AGGCGGCTGCAGTAGTCG; R: CCCCTTAGCCTCCCTCACTC | SSCP analysis | 128 | 58/1.5/R or S/M 
 | PGAM2 | phosphoglycerate mutase 2 (muscle) | Z98802* | F: GGAGCTGAACCTGCCCACAG; R: GGTTGCCTTTATTGCCGAGCC | SSCP analysis | 172 | 61/1.5/S/P 
 | | | Z98802* | F: CCATCGTGTACGAGCTGGA; R: GGGTAAGGTTGCCTTTATTGC | PCR-RFLP analysis (CfoI) | 152 | 58/1.5/R/P 

(a) EMBL/GenBank accession nos. of sequences used to derive heterologous (for PRKAA2 and PRKAB1) or homologous primers (for the other loci). *, sequences of partial cDNA 
isolated from a porcine skeletal muscle cDNA library (Davoli et al., 1999, 2002). **, sequences of ESTs identified by database mining using the human cDNA sequence indicated 
in parentheses with e-value and % identity. 
(b) F, forward primer; R, reverse primer. 
(c) Use of the primers. STS, sequence tagged site characterization. For PCR-RFLP analyses, the restriction enzyme used is reported in parentheses. For PRKAG3, the 
mutation analysed by PCR-RFLP or PCR-SSCP is also indicated, following the nomenclature reported by Milan et al. (2000) and Ciobanu et al. (2001). 
(d) PCR conditions: Optimal annealing temperature/[MgCl2]/Taq DNA polymerase: E, EuroTaq (from EuroClone Ltd., Paignton, Devon, UK); R (from Roche Diagnostics, Mannheim, 
Germany); S (from Sigma Aldrich, St. Louis, MO, USA)/thermal cycler: M, PTC-100 (MJ Research, Watertown, MA, USA); P, Perkin Elmer 9600 (Perkin Elmer, Roche Molecular 
System, Branchburg, NJ, USA). 
(e) Polymorphism described by Milan et al. (2000). 
(f) Polymorphism described by Ciobanu et al. (2001). 
(g) PCR primers and polymorphism described by Jiang and Gibson (1998). 
(h) PCR primers reported by Fridolfsson et al. (1997). 

Here we studied several genes involved in the regulation of 
the energy balance, glycogen metabolism and glycolysis of the 
skeletal muscle with the aims of adding some of these candidate 
genes to the linkage and physical maps of the porcine genome 
and of testing some DNA markers in association studies with meat 
quality traits. 

Materials and methods 

PCR, identification and analysis of mutations 

Fragments of the candidate genes were amplified using 
primers designed on partial cDNAs obtained from a porcine skeletal muscle 
cDNA library (Davoli et al., 1999, 2002), primers designed on 
expressed sequence tags (ESTs) identified by database mining performed 
with BLASTN (Altschul et al., 1997) and the corresponding human gene 
sequences, or primer pairs chosen on porcine gene sequences described 
in the literature and available in DNA databases (Table 1). Searches for mutations 
in the amplified fragments of the ALDOA, GAA, GYS1, LDHA, 
PGAM2, PKM2, PRKAA2, PRKAB1, PRKAB2 and PYGM loci were performed 
by SSCP analysis as described by Fontanesi et al. (2001). These 
fragments were sequenced both to confirm that the amplicons 
obtained corresponded to the expected gene regions and to characterize 
the mutations revealed by SSCP analysis. Sequencing reactions were 
carried out with [alpha-35S]-dATP using the Cycle Sequencing kit (Applied Biosystems, 
Foster City, CA, USA) or with BigDye Terminator cycle sequencing 
chemistry v2.0 (Applied Biosystems). 

When the identified mutations created/disrupted a restriction site, a
PCR-RFLP method was established (Table 1). For GPI, the amplified product
length polymorphism (APLP) described by Jiang and Gibson (1998) was
analyzed. Five mutations described at the PRKAG3 locus by Milan et al.
(2000) and Ciobanu et al. (2001), namely T30N, G52S, L53P, I199V and
R200Q, were analyzed by PCR-RFLP or PCR-SSCP as reported in Table 1.
PCR-RFLP and PCR-APLP products were resolved on 10% polyacrylamide/
bis-acrylamide 29:1 gels stained with ethidium bromide. Allele
frequencies for the polymorphic sites identified and analyzed were studied in
samples of unrelated pigs of the Large White (LW), Landrace (L), Duroc (D),
Belgian Landrace (BL), Hampshire (H), Pietrain (P) and Meishan (M)
breeds.
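
As a minimal sketch of how allele frequencies such as those reported in Table 3 can be derived from genotype counts, the following Python function applies direct gene counting. The function name, the genotype labels and the counts in the usage line are hypothetical and are not taken from the study.

```python
from collections import Counter


def allele_frequencies(genotype_counts):
    """Estimate allele frequencies by direct gene counting.

    genotype_counts maps a two-character genotype label (for example "12"
    or "TN", one character per allele) to the number of animals observed
    with that genotype.
    """
    allele_counts = Counter()
    for genotype, n_animals in genotype_counts.items():
        if len(genotype) != 2:
            raise ValueError(f"expected a diploid genotype, got {genotype!r}")
        # Each animal contributes one copy of each allele in its genotype.
        for allele in genotype:
            allele_counts[allele] += n_animals
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no genotyped animals")
    return {a: c / total for a, c in sorted(allele_counts.items())}


# Hypothetical sample: 10 animals 11, 20 animals 12, 10 animals 22.
print(allele_frequencies({"11": 10, "12": 20, "22": 10}))
```

With these hypothetical counts each allele is seen 40 times out of 80 chromosomes, so both frequencies are 0.5.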

Linkage and physical mapping 

DNA samples belonging to three-generation families of the PiGMaP
Consortium (Archibald et al., 1995) were genotyped for the polymorphisms
identified at the GAA, LDHA (exon 3 mutation), PGAM2, PKM2, PRKAB1
and PYGM loci. The genotypes were merged with those present in the
ResPig database (http://www.resSpecies.org). Two-point and multipoint
procedures of the CRI-MAP package version 2.4 (Green et al., 1990) were
performed. Multipoint sex-averaged maps were constructed using options ALL,
BUILD, CHROMPIC and FLIPS2-6.
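
A two-point LOD score of the kind produced by this analysis can be illustrated with a simplified calculation. The sketch below assumes phase-known, fully informative meioses, which is a simplification: CRI-MAP evaluates likelihoods over whole pedigrees, so its values need not match this formula. The function name and the example counts are illustrative.

```python
import math


def two_point_lod(n_meioses, n_recombinants, theta):
    """LOD score for linkage at recombination fraction theta.

    Simplifying assumption: all meioses are phase-known and informative,
    so the likelihood is theta**k * (1 - theta)**(n - k), compared with
    the likelihood under free recombination (theta = 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("recombinants must lie between 0 and n_meioses")
    k, n = n_recombinants, n_meioses
    log_likelihood = 0.0
    if k > 0:
        if theta == 0.0:
            return float("-inf")  # recombinants are impossible at theta = 0
        log_likelihood += k * math.log10(theta)
    if n - k > 0:
        log_likelihood += (n - k) * math.log10(1.0 - theta)
    # Subtract the log-likelihood under the null hypothesis theta = 0.5.
    return log_likelihood - n * math.log10(0.5)


# 20 informative meioses without recombinants, evaluated at theta = 0.
print(round(two_point_lod(20, 0, 0.0), 2))  # 6.02
```

Under these assumptions, each non-recombinant meiosis contributes log10(2), about 0.30, to the LOD score at theta = 0.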

Physical mapping was attempted for the ALDOA, GAA, GYS1,
PRKAA2, PRKAB1 and PRKAB2 genes by PCR screening of the French
somatic cell hybrid panel (SCHP; Yerle et al., 1996), which consists of
27 rodent-porcine hybrid cell lines. PCR products were visualized on 10%
polyacrylamide/bis-acrylamide 29:1 gels stained with ethidium bromide.
The PCR results of the SCHP were evaluated by means of the software
described by Chevalet et al. (1997) and accessible at the WWW INRA server
(http://www.toulouse.inra.fr/lgc/lgc.html).

Meat quality trait measures and association studies 

From 507 commercial pigs that had been used in an experiment with
controlled transport conditions and genotyped at the RYR1 locus (Nanni
Costa et al., 1999), 2-4 animals for each class/treatment were selected,
for a total of 61 pigs. None of the selected animals carried the 1853T (n)
allele at this locus. For these animals, pH and colour measurements (L* and
a*) were taken on biceps femoris muscle at 1 h and 24 h post mortem.
Glycogen, glucose, glucose-6-phosphate and lactate content were determined
as reported by Nanni Costa et al. (1999) on the same muscle sampled at 1 h
and 24 h post mortem. GP was calculated according to Monin and Sellier
(1985) and expressed as µmol of lactate equivalent per gram of fresh muscle.
The GP value of the selected animals ranged from 91.85 to 232.39 µmol at
1 h post mortem (mean ± s.d.: 166.69 ± 27.53) and from 96.86 to 178.00 µmol
at 24 h post mortem (mean ± s.d.: 136.05 ± 16.15). Drip and cooking losses
were assessed on samples of longissimus thoracis muscle collected from the
same pigs. The hams were delivered to a plant to be processed as Parma
dry-cured hams, and first salting loss and processing loss were determined
(Nanni Costa et al., 1999). The selected animals were genotyped at the RN
locus (R200Q at the PRKAG3 gene; Milan et al., 2000) and analyzed for the
polymorphisms of the candidate genes investigated: GAA, GPI, LDHA
(3'-untranslated region and exon 3 mutations), PGAM2, PKM2, PRKAB1, PRKAG3
(T30N, G52S, L53P and I199V mutations) and PYGM. Associations between
genotypes at the candidate genes and meat technological parameters were
evaluated by means of the GLM procedure of SAS (1994). The model included
the fixed effects of the genotype of these genes (one locus for each model
run), pre-slaughter conditions (loading method, stocking density, truck
deck) and sex, and the random effect of day of slaughtering. P = 0.01 was
considered as the threshold for significance.
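
The GP calculation can be sketched as follows. The formula used here, GP = 2 x ([glycogen] + [glucose] + [glucose-6-phosphate]) + [lactate], is the form usually attributed to Monin and Sellier (1985); it is stated as an assumption, since the text above only cites the reference. All concentrations are taken to be in µmol per gram of fresh muscle, and the function name and input values are hypothetical.

```python
def glycolytic_potential(glycogen, glucose, glucose_6_phosphate, lactate):
    """Glycolytic potential in µmol of lactate equivalent per g of muscle.

    Assumed formula: each glucose unit (from glycogen, free glucose or
    glucose-6-phosphate) can yield two lactate molecules, and the lactate
    already present is added once.
    """
    values = (glycogen, glucose, glucose_6_phosphate, lactate)
    if any(v < 0 for v in values):
        raise ValueError("concentrations must be non-negative")
    return 2.0 * (glycogen + glucose + glucose_6_phosphate) + lactate


# Hypothetical measurements (µmol/g fresh muscle), not data from the study.
print(glycolytic_potential(50.0, 5.0, 5.0, 40.0))
```

Doubling the glucose-equivalent terms is what places glycogen, glucose and glucose-6-phosphate on the same lactate-equivalent scale as the lactate measurement.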


Results and discussion 

Isolation of porcine gene fragments, identification and
analysis of mutations in candidate genes

New porcine sequence tagged sites (STS), including portions
of introns and/or exons, were obtained for several genes using
heterologous (PRKAA2 and PRKAB1) or homologous (GAA,
GYS1 and ALDOA) primers. Indications of the amplified gene
regions are reported in the corresponding EMBL entries
(Tables 1 and 2).

SSCPs were observed for six of the investigated genes
(PRKAB1; GAA; PYGM; LDHA, 3'-UTR and exon 3 region;
PGAM2; PKM2). All the alleles identified at these loci have
been characterized and sequence information is reported in
Table 2. Sequence data of the PRKAB1, PYGM, LDHA (exon
3 polymorphism) and PGAM2 alleles made it possible to set up
a PCR-RFLP protocol to analyze these markers (Fig. 1a, c, e
and f, respectively; Table 1), while for the GAA, LDHA
(3'-UTR) and PKM2 loci the polymorphisms were analyzed by
SSCP analysis (Fig. 1b, d, g, respectively).

For LDHA, Fridolfsson et al. (1997) already reported the
presence of a biallelic SSCP in the 228-bp amplified fragment
of the 3'-UTR. We confirmed the presence of a polymorphism
in this amplified fragment of the gene but, as no sequence
information was available for that marker, it is not possible to know
whether the polymorphism that we found is the same as that previously
reported. Allele frequencies were analyzed in seven different
breeds for all these newly described polymorphisms (Table 3).
This investigation was also carried out to study the allele
distribution in the same breeds for four other polymorphic sites in
two other candidate genes (GPI and PRKAG3 T30N, G52S
and I199V) for which mutations were already described in the
literature (Jiang and Gibson, 1998; Milan et al., 2000; Ciobanu et
al., 2001) (Table 3). Allele frequency differences can be
observed between breeds for some of these markers.

Linkage and physical mapping 

The new polymorphisms identified at the PRKAB1, GAA,
PYGM, LDHA (exon 3 mutation), PGAM2 and PKM2 loci
made it possible to linkage map these genes (Table 2). PRKAB1
was placed on the genetic map of porcine chromosome (SSC)
14 as follows: Sw857 - 27.3 cM - S0037 - 8.0 cM - CTSB -
11.2 cM - PRKAB1 - 2.8 cM - S0058 - 0.0 cM - UBC -.
The physical assignment obtained by analyzing the SCHP
(SSC14q22→q24) confirmed the genetic localization (Table 2)
and, as the human PRKAB1 gene maps to human chromosome
(HSA) 12q24.23, these data agree with Zoo-FISH results
between human and pig (Goureau et al., 1996). The map
position of porcine PRKAB1 is close to the putative QTL
regions for percent cooking loss and tenderness score identified
on SSC14 (Malek et al., 2001).

Table 2. EMBL accession numbers of the sequences obtained, chromosomal localization of some investigated porcine loci obtained by linkage mapping
and/or by SCHP analysis. The corresponding human chromosome location is indicated for comparative mapping.

Loci           EMBL accession nos.   Linkage assignment                             SCHP assignment                  Human chromosome
(description)                        Chr.  2pt closest  2pt LOD score  No. of       Chr. regions       Statistical   localization b
                                           marker       (θ)            informative                     score a
                                                                       meioses
PRKAA2         AJ557220 (STS)        -     -            -              -            -                  -             1p32.2
PRKAB1         AJ557221 (allele 1)   14    UBC          11.76 (0.02)   82           14q22-q24          0.80 */1/0    12q24.23
               AJ557222 (allele 2)
PRKAB2         AJ557223 (STS)        -     -            -              -            4q21-q23           0.88 **/1/0   1q21.1
GAA            AJ557225 (allele 1)   12    S0143        12.37 (0.06)   114          12(2/3)p13-p11     0.89 **/0/0   17q25.3
               AJ557226 (allele 2)
               AJ557227 (allele 3)
GYS1           AJ557224 (STS)        -     -            -              -            6>(q21)            0.79 */0/0    19q13.33
PYGM           AJ557228 (allele 1)   2     Sw256        4.21 (0.00)    28           2p17-p14 c         -             11q13.1
               AJ557229 (allele 2)
ALDOA          AJ557230 (STS)        -     -            -              -            3p17-p16           0.87 **/2/0   16p11.2
LDHA                                                                                2p17-p14 d         -             11p15.1
(3'-UTR)       AJ557231 (allele 1)   -     -            -              -
               AJ557232 (allele 2)
(exon 3)       AJ557233 (allele 1)   2     S0091        5.76 (0.08)    58
               AJ557234 (allele 2)
PGAM2          AJ557237 (allele 1)   18    IGFBP3       6.88 (0.06)    103          18q13-q21 c                      7p13
               AJ557238 (allele 2)
               AJ557239 (allele 3)
PKM2           AJ557235 (allele 1)   7     SSTR1        4.28 (0.05)    45           7q12-q23; 7q26 e                 15q23
               AJ557236 (allele 2)

a Probability of the regional assignment with risk <0.1% **, <0.5% * / number of false positive / number of false negative.
b Human chromosome location obtained from Ensembl Genome Server (http://www.ensembl.org/Homo_sapiens/).
c SCHP mapping of the porcine gene obtained by Davoli et al. (2000).
d SCHP mapping of the porcine gene obtained by Fridolfsson et al. (1997).
e SCHP mapping of the porcine gene obtained by Davoli et al. (2002).

The porcine GAA was linkage mapped to SSC12 (S0143 -
6.1 cM - GAA - 26.0 cM - PRKAR1A - 7.2 cM - S0083 -) and
physically mapped to the region 12p13(2/3)→p11 (Table 2). The
human GAA gene is localized to HSA17q25.3 and the assignment
of the porcine GAA gene adds a new type I marker to the
comparative map between SSC12 and HSA17.

The markers identified at the porcine PYGM and LDHA
loci were placed on the linkage map of SSC2 (S0110 - 0.0 cM -
Sw256 - 0.0 cM - PYGM - 7.3 cM - S0141 - 16.0 cM - Sw240
- 18.3 cM - FSHB - 7.7 cM - S0170 - 0.6 cM - LDHA -
6.4 cM - INSR - 0.0 cM - S0091 -) and these data confirm the
physical mapping of PYGM already reported by Davoli et al.
(2000) and the genetic and physical mapping of LDHA
obtained by Fridolfsson et al. (1997) (Table 2). The localization of
LDHA is close to the putative QTL for drip loss and water
holding capacity identified on SSC2 by Malek et al. (2001).

PGAM2 was genetically assigned to SSC18 (PGAM2 -
5.2 cM - IGFBP3 - 6.2 cM - S0120 - 1.3 cM - S0062) and this
localization agrees with the SCHP assignment reported by
Davoli et al. (2000) (Table 2). The map position of this gene is
close to the putative QTL regions identified on SSC18 for meat
colour and muscle pH (Malek et al., 2001; Paszek et al., 2001).

PKM2 was placed on the linkage map of SSC7 (- S0066 -
4.8 cM - SSTR1 - 0.0 cM - ANPEP - 3.3 cM - MYH7 -
9.0 cM - S0029 - 7.9 cM - S0115 - 15.2 cM - PKM2 - 24.1 cM
- POA1A - 0.0 cM - PI1 - 10.3 cM - PI - 6.7 cM - S0212),
confirming the physical localization (Davoli et al., 2002) and
the genetic mapping of this gene obtained using a human
cDNA probe in RFLP analysis (Marklund et al., 1996)
(Table 2). On SSC7, QTL or putative QTL for meat colour and
muscle moisture have been identified in a region surrounding
PKM2 (Malek et al., 2001; Paszek et al., 2001; Ovilo et al.,
2002).

Three other genes (PRKAB2, GYS1 and ALDOA), for
which no polymorphism was identified, were physically
mapped by SCHP analyses (Table 2). The physical mapping of these
three genes agrees with the human-porcine comparative
mapping data (Goureau et al., 1996). Statistical scores for the
physical localization of the PRKAA2 locus were not significant.

Association studies 

The selected pigs were genotyped at the PRKAG3 R200Q
site by means of a PCR-RFLP protocol with MbiI as restriction
enzyme (Fig. 1h). None of the tested pigs carried the 200Q
allele, even though some animals showed a GP value at one hour
post mortem higher than 180-183 µmol lactate equivalent g⁻¹
muscle wet weight, which has been indicated as a general threshold
to identify the carriers of the RN⁻ allele (Lundstrom et al., 1996;
Enfalt et al., 1997). Therefore, the high GP of some animals, not
explained by the presence of the 200Q allele, may suggest that
other genetic factors could influence this parameter in different
pig populations.

Fig. 1. (a) PCR-RFLP at the PRKAB1 locus; (b) SSCP at the GAA locus; (c) PCR-RFLP at the PYGM locus; (d) SSCP at the
LDHA locus (3'-UTR); (e) PCR-RFLP at the LDHA locus (exon 3), the 50-bp fragment of allele 2 is not shown in the gel;
(f) PCR-RFLP at the PGAM2 locus, only the main fragments that differentiate the three alleles are indicated; (g) SSCP at the
PKM2 locus; (h) PCR-RFLP at the PRKAG3 R200Q polymorphic site. The genotypes of the corresponding gel lanes are indicated
at the top of each figure. M: molecular weight marker VIII (Roche Diagnostics), u: undigested product.
[Gel images not reproduced. Band labels: one PCR-RFLP panel (panel not identifiable), allele 1 (150 bp), allele 2 (125 + 25 bp); (e) allele 1 (150 bp), allele 2 (103 + 50 bp);
(f) allele 1 (77 bp), allele 2 (104 bp), allele 3 (127 bp); (h) Q (259 bp), R (219 + 40 bp).]

Then, as a first attempt to identify DNA markers associated
with meat quality traits, the new polymorphisms identified in
this study, the PCR-APLP described at the GPI locus (Jiang
and Gibson, 1998) and the other polymorphic sites at the
PRKAG3 locus (T30N, G52S, L53P and I199V mutations;
Milan et al., 2000; Ciobanu et al., 2001) were analyzed in the
selected pig population. For three markers (LDHA 3'-UTR
marker, PYGM and PRKAG3 L53P) all the selected pigs were
homozygous. The results of the association study are shown in
Table 4. Significant associations (P < 0.001) were observed for
two polymorphic sites at the PRKAG3 locus (T30N and G52S) with
L* at 24 h. Meat from pigs with genotype TT and GG was lighter
than that from pigs with genotype NN and SS, respectively. No
significant result was found for PRKAG3 I199V, which has been
described to have a major effect on muscle glycogen, lactate and
GP values as well as on pH and colour (Ciobanu et al., 2001).
However, it should be considered that only two animals of
genotype II were identified. Association analysis performed using
the haplotypes at this gene, for the animals for which it was
possible to infer the phase of the three sites, confirmed a significant
effect of this locus on L* (data not shown). The two genotypes
identified at the PGAM2 locus showed significant effects on
drip loss. Animals with genotype 22 showed higher liquid loss,
a trend that was also seen, although not significantly, in cooking
losses, first salting loss and curing losses (data not shown).
PKM2 was significantly associated with glycogen content at one
hour post mortem, with genotype 11 showing a reduced level
of this metabolite. GP at one hour, even if not significant
(data not shown), confirmed the trend observed for glycogen
content.

Table 3. Allele frequencies of the genotyped
markers in unrelated pigs of the LW, L, D, BL,
H, P and M breeds. The numbers of animals
analysed for each breed/locus are indicated in
parentheses.

Loci            Alleles   Breeds
                          LW      L       D       BL      H       P       M
PRKAB1          1         0.34    0.03    0.02    0.00    0.05    0.00    0.11
                2         0.66    0.97    0.98    1.00    0.95    1.00    0.89
                          (72)    (57)    (61)    (32)    (29)    (25)    (9)
PRKAG3          T         0.82    0.87    0.31    0.83    0.98    0.95    0.96
                N         0.18    0.13    0.69    0.17    0.02    0.05    0.04
                          (37)    (20)    (32)    (33)    (29)    (38)    (13)
                G         0.54    0.92    0.86    0.80    0.95    0.53    0.77
                S         0.46    0.08    0.14    0.20    0.05    0.47    0.23
                          (37)    (20)    (32)    (33)    (29)    (38)    (13)
                I         0.20    0.55    0.14    0.50    0.29    0.28    0.00
                V         0.80    0.45    0.86    0.50    0.71    0.72    1.00
                          (37)    (20)    (32)    (32)    (28)    (37)    (8)
GAA             1         0.00    0.04    0.00    0.05    0.00    0.02    0.94
                2         0.60    0.53    0.81    0.24    0.71    0.72    0.06
                3         0.40    0.43    0.19    0.71    0.29    0.26    0.00
                          (73)    (56)    (61)    (33)    (29)    (36)    (9)
PYGM            1         1.00    1.00    1.00    1.00    1.00    1.00    0.77
                2         0.00    0.00    0.00    0.00    0.00    0.00    0.23
                          (46)    (8)     (10)    (10)    (15)    (11)    (11)
GPI             1         0.46    0.42    0.66    0.03    0.00    0.12    0.00
                2         0.54    0.58    0.34    0.97    1.00    0.88    1.00
                          (37)    (20)    (32)    (32)    (27)    (36)    (9)
LDHA (3'-UTR)   1         1.00    1.00    0.97    1.00    0.67    1.00    1.00
                2         0.00    0.00    0.03    0.00    0.33    0.00    0.00
                          (21)    (17)    (30)    (16)    (29)    (9)     (8)
(exon 3)        1         0.85    0.90    0.90    0.87    0.98    0.87    0.79
                2         0.15    0.10    0.10    0.13    0.02    0.13    0.21
                          (44)    (20)    (31)    (28)    (25)    (41)    (12)
PGAM2           1         0.16    0.11    0.93    0.12    0.41    0.31    0.89
                2         0.84    0.89    0.07    0.80    0.59    0.69    0.11
                3         0.00    0.00    0.00    0.08    0.00    0.00    0.00
                          (73)    (57)    (61)    (33)    (29)    (35)    (9)
PKM2            1         0.47    0.60    0.37    0.79    0.50    0.58    0.06
                2         0.53    0.40    0.63    0.21    0.50    0.42    0.94
                          (73)    (57)    (61)    (33)    (29)    (36)    (9)


Table 4. Association results. Least square means (LSM) estimated for
each polymorphism are indicated with their standard error (SE). Significant
differences (within a trait) between the genotype classes are indicated with
different superscript letters: a, b P < 0.01; A, B P < 0.001.

Traits          Loci      Genotypes (no. of pigs)   LSM (SE)
L* 24 h         PRKAG3    TT (22)                   54.20 (1.37) A
                          TN (34)                   50.59 (0.54) B
                          NN (5)                    47.67 (0.71) B
                PRKAG3    GG (30)                   50.97 (0.65) A
                          GS (22)                   50.29 (0.70) A
                          SS (9)                    46.24 (1.05) B
Drip loss (%)   PGAM2     12 (34)                   3.35 (0.22) a
                          22 (27)                   4.15 (0.24) b
Glycogen 1 h    PKM2      11 (11)                   51.01 (4.54) a
                          12 (30)                   63.29 (3.08) a,b
                          22 (20)                   68.29 (3.50) b

Conclusions 

In this study we considered 12 candidate genes for glycolytic
potential and meat quality traits in pigs. On the whole we
identified 11 new SNPs in six of these genes (PRKAB1, GAA,
PYGM, LDHA, PKM2 and PGAM2) that were assigned by
linkage mapping to five different pig chromosomes. Two of
these genes (PRKAB1 and GAA) were also physically mapped,
together with three other genes (PRKAB2, GYS1 and ALDOA)
for which no polymorphism was identified. These assignments
help to extend the type I marker maps of the porcine
genome and confirm the human-porcine comparative mapping
data. Two other loci (GPI and PRKAG3) were also investigated
using polymorphisms already described. Eleven DNA
markers were analyzed in several pig breeds and these data
could be used in population genetic studies.

Moreover, we observed that high GP values are not due to
the presence of the RN⁻ allele (200Q) in the pigs that we
analyzed. However, our data may confirm the effect of other
mutations at the PRKAG3 locus (T30N and G52S) on meat quality.
Significant associations have been observed between PGAM2
and drip loss and between PKM2 and muscle glycogen content.
These data are interesting indications, and further investigation
is needed to confirm these preliminary results.

Abstract. Genes coding for sarcomeric proteins may play a
key role in muscle mass accretion and meat production. By
screening a skeletal muscle cDNA library we isolated two partial
sequences coding for the sarcomeric myopalladin and titin
genes. In the present work we identified three SNPs in the
3'-untranslated region, two at the myopalladin locus and one at
the titin locus. Myopalladin was mapped on porcine chromosome
(SSC) 14 using a somatic cell hybrid panel, a radiation
hybrid panel and by linkage mapping. The linkage mapping of
titin confirmed the position on SSC15. We then analysed the
distribution of the alleles at both loci in six different porcine
breeds. The analysis of the allele frequencies for these two
loci in extremely divergent groups of pigs selected according to
lean cuts (LC) and average daily gain (ADG) approached the
significance level for myopalladin and the LC trait. Further studies
are needed to test the presence of a putative effect of
myopalladin on lean meat content.

Copyright © 2003 S. Karger AG, Basel


Skeletal muscle genes are potential candidates for production
and meat quality traits. Genes coding for sarcomeric proteins
may play key roles in muscle mass accretion and could
influence meat production. In pigs, several studies have
focused on fibre types, structural muscle genes and their possible
relationship with qualitative and quantitative characteristics
of meat (Lefaucher and Gerrard, 2000; Eggert et al., 2002;
Chang et al., 2003a, 2003b; Davoli et al., 2003).

To gain more insight into muscle-specific genes,
Davoli et al. (1999; 2002) prepared and analysed a porcine
skeletal muscle cDNA library, from which more than 700
expressed sequence tags (ESTs) have been isolated. Recently,
from this cDNA library we isolated partial cDNAs coding for
myopalladin and titin, two important porcine skeletal muscle
proteins involved in sarcomere and myofibril assembly.

Myopalladin (MYOP or FLJ14437) is a 145-kDa sarcomeric
protein, which tethers together α-actinin with nebulin in
skeletal muscle. It has central roles in the organization and
assembly of the Z-line (Bang et al., 2001b) and seems to be involved
in regulatory mechanisms of muscle gene expression (Ma and
Wang, 2002).

Titin (TTN), also known as connectin, with a relative
molecular mass of more than 3,000 kDa, is the largest known
protein. Its molecules are stringlike and in vivo span from the
Z-disk, where the N-terminus of TTN is located, to the center
of the sarcomere in the M-line. TTN is the third most abundant
protein in striated muscles, after myosin and actin. This giant
protein is thought to be involved in muscle assembly and
ultrastructure and it is also implicated in muscle elasticity (Keller,
1995; Maruyama, 1997). The human gene contains 363 exons
and numerous differentially spliced isoforms have been
identified in vertebrates (Bang et al., 2001a).

In the present work we studied these two loci with the aim of
identifying DNA markers that could be used in association
studies with meat production traits in pigs.

Materials and methods

PCR, identification of mutations and allele frequencies 

PCR primers for MYOP (forward: 5'-GACTGGTGCATGCATTGTAGA-3';
reverse: 5'-AAACCTGCCTCTCCGCTTA-3') amplified a 153-bp
fragment of the 3'-untranslated region (3'-UTR) of the gene (EMBL acc. no.
AJ560657). PCR primers for TTN (forward: 5'-CAGAGCAGTGCCAACTCTTG-3';
reverse: 5'-TCAAATTGATATTCTGGGCAGTT-3') were designed
in the 3'-UTR (EMBL acc. no. AJ560658) and amplified a fragment of
195 bp. PCR was performed in a total volume of 20 µl using 50-100 ng of
porcine genomic DNA, 10 pmol each primer, 250 µM each dNTP, 2.0
(MYOP) or 2.5 (TTN) mM MgCl2 and 1 U Taq polymerase (Roche
Diagnostics, Mannheim, Germany, for MYOP; AmpliTaq Gold DNA Polymerase,
Applied Biosystems, Foster City, USA, for TTN). Amplification was carried
out using a 9600 Perkin Elmer DNA thermal cycler with the following
profile: an initial denaturation step at 95 °C for 5 min for MYOP or 10 min for
TTN, followed by 35 cycles of 30 s at 95 °C, 30 s at 60 °C (MYOP) or 53 °C
(TTN) and 30 s at 72 °C. A final extension step at 72 °C followed for 5 min.

PCR products from 19 pigs belonging to different pig breeds were
analysed to search for single strand conformation polymorphisms (SSCPs)
according to the protocol described by Fontanesi et al. (2001). Sequencing of
the PCR fragments was performed after purification with Microcon YM-50
columns (Amicon, Millipore Corporation, Bedford, MA), using the same
PCR primers and the BigDye Terminator cycle sequencing chemistry v2.0
(Applied Biosystems). Sequencing reactions were loaded in an ABI 3100
sequencer (Applied Biosystems) and sequence data were analysed with the
Chromas software (version 1.45, McCarthy, Griffith University, Southport,
Australia).

PCR-RFLP was performed for the MYOP amplicons using 5 µl of each
PCR and 5 U of RsaI in a total volume of 25 µl. Digestion products were
electrophoresed on 10% polyacrylamide gels and stained with ethidium
bromide. For TTN it was not possible to develop a PCR-RFLP protocol, so
further analyses were carried out using the SSCP protocol as reported
above.

Allele frequencies at the MYOP and TTN loci were studied in a sample 
of 135 and 127 pigs, respectively, belonging to six different pig breeds (Large 
White, Landrace, Duroc, Belgian Landrace, Hampshire and Pietrain). 

Mendelian inheritance of the polymorphisms and linkage mapping 

DNA samples of pigs belonging to three and six three-generation families
of the PiGMaP Consortium (Archibald et al., 1995) were genotyped at the
TTN and MYOP locus, respectively, to check the codominant inheritance of
the polymorphism of these genes. Two-point and multipoint procedures of
the CRI-MAP package version 2.4 (Green et al., 1990) were performed by
merging the MYOP and TTN genotypes with the PiGMaP Consortium
ResPig database (http://www.resSpecies.org). A multipoint sex-averaged map
containing MYOP was constructed using options ALL, BUILD, CHROMPIC
and FLIPS2-6.

Somatic cell hybrid and radiation hybrid mapping of MYOP 

The DNA of the French somatic cell hybrid panel (SCHP) (Yerle et al.,
1996) and of the INRA-Minnesota 7,000-rad radiation hybrid panel
(IMpRH panel) (Yerle et al., 1998), consisting of 27 and 118 rodent-porcine
hybrid cell lines, respectively, was amplified using the MYOP primers
already reported. No PCR fragment was obtained from the control rodent
genomic DNA. The PCR products were visualized on 10% polyacrylamide/
bis-acrylamide 29:1 (SCHP) or 2% agarose gels (IMpRH panel).

Evaluation of the somatic cell hybrid panel PCR results was performed 
by means of software described by Chevalet et al. (1997) and accessible at the 
WWW INRA server (http://www.toulouse.inra.fr/lgc/lgc.html). 

The results of radiation hybrid PCR products were analysed with the
IMpRH mapping tool developed by Milan et al. (2000) and accessible
through the http://imprh.toulouse.inra.fr/ web address. The multipoint
location of the MYOP gene was obtained using minimum breakage criteria
(Milan et al., 2000).

Analysis of allele frequencies in extreme divergent groups of pigs 

Estimated breeding values (EBVs) for economically important traits
calculated by the National Association of Pig Breeders (Associazione Nazionale
Allevatori Suini, ANAS; http://www.anas.it) were available for 3,591 Large
White pigs used for sib-test during the period 1996-1999. Among these
animals, 100 pigs with extremely divergent EBVs (50 with the highest and 50 with
the lowest values) for average daily gain (ADG; calculated from 30 kg to
155 kg of live weight with a quasi ad libitum feed intake) were selected.
Another group of 100 animals with extremely divergent EBVs (50 with the
highest and 50 with the lowest values) for weight of lean cuts (LC), which
included neck and loin, was selected within the same group of 3,591 Large
White animals. EBV means ± s.d. for the two extreme groups (positive and
negative) of chosen pigs for the two investigated traits were as follows:
+104.36 ± 13.01 g and -60.34 ± 13.12 g for ADG; +5.95 ± 0.49 kg and
-3.74 ± 0.53 kg for LC.

Fig. 1. SSCP (a) and PCR-RFLP (b) at the MYOP locus. The genotypes
are indicated at the top of each lane. M: molecular weight marker VIII
(Roche Diagnostics Mannheim, Germany).

DNA of the selected pigs was extracted from lyophilized blood using a
standard protocol (Sambrook et al., 1989). These animals were then
genotyped at the MYOP and TTN loci using the protocols described above.
For each trait, the difference in allele frequency between the positive and
negative groups was tested with Fisher's exact test (two-tailed), using a
stringent threshold for significance (P < 0.01) in order to account for
multiple testing. The test was applied considering all the animals (50 pigs for
each tail), only two-generation unrelated pigs or only three-generation
unrelated pigs among the chosen extreme animals (Table 2).
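
The comparison of allele counts between two groups can be sketched with a pure-Python two-tailed Fisher's exact test on a 2 x 2 table. This is an illustration under stated assumptions, not the software used in the study: the function name is hypothetical, the two-tailed P value is taken as the sum of the probabilities of all tables that are no more likely than the observed one, and the allele counts in the usage line are invented.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two groups (e.g. positive and negative EBV), columns are
    the counts of allele 1 and allele 2 in each group.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def table_probability(x):
        # Hypergeometric probability of x copies of allele 1 in group 1,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as unlikely as, or less likely than, the
    # observed one; the small tolerance absorbs floating-point ties.
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Invented allele counts for two groups of 50 pigs (100 alleles each).
print(fisher_exact_two_tailed(90, 10, 80, 20))
```

Because every animal contributes two alleles, a group of 50 genotyped pigs corresponds to a row total of 100 in such a table.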


Results and discussion 

Identification and analysis of single nucleotide 
polymorphisms 

SSCP analysis of the 153-bp fragment of the MYOP gene
revealed the presence of a biallelic polymorphism in a sample
of pigs from different breeds (Fig. 1a). Sequencing of two
homozygous 1/1 and two homozygous 2/2 animals showed the
presence of two single nucleotide polymorphisms (SNPs) at
positions 77 (G→T) and 97 (T→C) of the amplified product of
the MYOP gene. Allele 1 carries G and T at positions 77 and 97,
respectively, while allele 2 carries T and C at the same two
positions (EMBL acc. no. of the cDNA clone sequence AJ560657).
The transition at position 97 creates/disrupts a restriction site
for RsaI, thus a PCR-RFLP protocol with this endonuclease
was used to analyse this marker of the porcine MYOP locus
(Fig. 1b).

Table 1. Allele frequencies at the MYOP and
TTN loci in different pig breeds

Breeds             Allele frequencies
                   MYOP                                  TTN
                   No. of    Allele 1 (T)  Allele 2 (C)  No. of    Allele 1 (T)  Allele 2 (C)
                   animals                               animals
Large White        39        0.85          0.15          39        0.87          0.13
Landrace           27        0.52          0.48          20        0.83          0.17
Duroc              30        0.52          0.48          30        0.42          0.58
Belgian Landrace   16        0.84          0.16          16        0.84          0.16
Hampshire          11        0.64          0.36          10        1.00          0.00
Pietrain           12        0.79          0.21          12        0.75          0.25


Table 2. Allele frequencies and probability
from Fisher's two-tailed exact test of equal
frequency in positive vs negative groups for the
polymorphisms at the MYOP and TTN loci

Trait   Group a   MYOP                                    TTN
                  No. b   Allele frequencies      P       No. b   Allele frequencies      P
                          Allele 1    Allele 2                    Allele 1    Allele 2
ADG     P         50      0.89        0.11        0.271   50      0.98        0.02        0.067
        N         50      0.83        0.17                50      0.90        0.10
        P(u)      18      0.81        0.19        0.704   18      0.97        0.03        0.360
        N(u)      37      0.82        0.18                37      0.90        0.10
        P(u*)     18      0.81        0.19        0.689   18      0.97        0.03        0.353
        N(u*)     35      0.83        0.17                35      0.90        0.10
LC      P         50      0.93        0.07        0.028   50      0.96        0.04        0.089
        N         50      0.81        0.19                50      0.89        0.11
        P(u)      30      0.92        0.08        0.051   30      0.93        0.07        0.624
        N(u)      42      0.79        0.21                42      0.90        0.10
        P(u*)     18      0.92        0.08        0.130   18      0.92        0.08        0.804
        N(u*)     31      0.77        0.23                31      0.93        0.07

a P, positive EBV; N, negative EBV; P(u), positive EBV of two-generation unrelated pigs; N(u), negative EBV
of two-generation unrelated pigs; P(u*), positive EBV of three-generation unrelated pigs; N(u*), negative EBV of
three-generation unrelated pigs.
b Number of pigs typed for each group.


Fig. 2. SSCP identified at the TTN locus. The genotypes are indicated at
the top of each lane.


A search for mutations using the SSCP protocol to analyse
the TTN amplicons allowed the identification of two alleles in
the tested pig population (Fig. 2). Two 1/1 homozygous and
two 2/2 homozygous pig DNAs were sequenced and the sequence
data provided evidence of an SNP (a C→T transition) at
position 61 of the amplified product. Allele 1 carries C while
allele 2 carries T (EMBL acc. no. of the cDNA clone sequence
AJ560658). This mutation, identified in the 3' UTR, is
different from that already reported in the literature (Bertani
et al., 1999), which was identified between exon 3 and exon 5 of
the porcine TTN gene.

Allele frequencies at the MYOP and TTN loci were analysed
in a total of 135 and 127 pigs, respectively, belonging to six
different breeds (Table 1). Allele 1 at the MYOP locus was always
the more frequent allele, although its frequency differed among
the studied breeds. A similar distribution was also observed for
allele 1 at the TTN locus, except in the Duroc breed, where
allele 2 was the more abundant.

Mapping 

Fig. 3. Physical (a), genetic (b) and RH (c) localisation of the porcine MYOP locus on SSC14.

TTN has already been physically mapped on porcine chromosome
SSC15 by SCHP (Bertani et al., 1999; Davoli et al., 2002) and
by IMpRH (data reported in the Genetpig database: Karsenty et
al., 2003). The SSCP marker identified at the TTN locus was
used to confirm the linkage mapping on SSC15 (Bertani et al.,
1999). A total of 20 informative meioses was obtained and the
results of the two-point analysis revealed linkage of TTN with
the following loci already identified on porcine chromosome 15:
DPP4 (θ = 0.00, LOD = 6.02); EAG (θ = 0.10, LOD = 3.20);
S0008 (θ = 0.00, LOD = 5.42); S0148 (θ = 0.10, LOD = 3.20);
S0284 (θ = 0.00, LOD = 5.42).
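
The two-point values reported for TTN can be illustrated with a minimal sketch. It assumes phase-known, fully informative meioses, which is a simplification and not a description of the CRIMAP computation; the function names are hypothetical. Under that assumption the LOD score is log10 of the likelihood at recombination fraction theta divided by the likelihood at theta = 0.5, and 20 meioses with no recombinants give about 6.02, while 2 recombinants in 20 meioses give about 3.20.

```python
import math


def two_point_lod(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """LOD score at recombination fraction theta.

    Assumption (not from the paper): every meiosis is phase-known and
    fully informative, so the likelihood is binomial in theta.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    non_recombinants = n_meioses - n_recombinants
    log_likelihood = 0.0
    if n_recombinants:
        if theta == 0.0:
            return float("-inf")  # recombinants are impossible at theta = 0
        log_likelihood += n_recombinants * math.log10(theta)
    if non_recombinants:
        log_likelihood += non_recombinants * math.log10(1.0 - theta)
    # Null hypothesis: free recombination, theta = 0.5 for every meiosis.
    return log_likelihood - n_meioses * math.log10(0.5)


def max_lod(n_meioses: int, n_recombinants: int) -> tuple[float, float]:
    """Maximum-likelihood theta (capped at 0.5) and the LOD score there."""
    theta_hat = min(n_recombinants / n_meioses, 0.5)
    return theta_hat, two_point_lod(n_meioses, n_recombinants, theta_hat)


if __name__ == "__main__":
    for n, k in [(20, 0), (20, 2), (18, 0)]:
        theta_hat, lod = max_lod(n, k)
        print(f"n={n} k={k} theta={theta_hat:.2f} LOD={lod:.2f}")
```

The pairs of meioses and recombinants used in the loop are inferred from the published LOD scores and are not stated in the text.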

Linkage mapping of the MYOP locus was obtained with 50
informative meioses. The two-point procedure revealed linkage
of MYOP with the following loci already identified on
porcine chromosome 14: SW295 (θ = 0.02, LOD = 8.13); GGT
(θ = 0.02, LOD = 7.54); S0007 (θ = 0.03, LOD = 6.12); ACTA1
(θ = 0.00, LOD = 6.02); S0063 (θ = 0.03, LOD = 5.27); ATP2A2
(θ = 0.07, LOD = 5.19); S0058 (θ = 0.09, LOD = 4.59); S0037
(θ = 0.00, LOD = 4.52); S0162 (θ = 0.00, LOD = 4.52); S0166
(θ = 0.00, LOD = 4.52); SW210 (θ = 0.03, LOD = 4.37). The
SCHP data assigned MYOP to two regions of SSC14 with the
same probability (0.45 + 0.45), namely q15→q16 or q25→q29
(correlation = 0.87 and 0.85, respectively; error risk < 0.5%;
positive/negative discordants 1/0 or 0/1, respectively). A more
precise localization of MYOP was obtained with the IMpRH
panel. The retention fraction of the MYOP amplicon in the RH
panel was 27% and the closest locus identified by two-point
analysis was SWR1113 (distance = 74 cR; LOD = 5.03). On
SSC14 the locus was localized between microsatellites SW1425
and SWR1113. Multipoint sex-averaged analysis placed the
porcine MYOP gene between ACTN2 (actinin alpha 2) and
PLAU (plasminogen activator urokinase), closely linked to
ACTA1 (actin alpha 1 skeletal muscle) on the genetic map of
SSC14. According to these results, the most probable SCHP
localisation of MYOP is the region 14q25→q29 (Fig. 3).

Comparative mapping information for this locus was available
only for human. The human MYOP gene has been
assigned to chromosome 10q22.2, confirming the conservation
of synteny between a fragment of SSC14 and HSA10q already
identified by Zoo-FISH experiments (Goureau et al., 1996).

Allele frequencies in extreme divergent groups of pigs 

We used the PCR-RFLP marker identified at the MYOP
locus and the TTN SSCP marker to evaluate whether allele
frequencies differed between extremely divergent groups of
pigs selected according to two important parameters in pig
breeding selection: lean cuts (LC) and average daily gain
(ADG). In particular, MYOP and TTN could be considered
candidate genes related to these two meat production traits
according to gene function, muscle-prevalent expression, map
position and QTL data. Indeed, on SSC14, where MYOP is
located, there are indications of the presence of QTLs for several
meat quality traits including loin weight and daily gain
(Rohrer and Keele, 1998).

Allele 1 at the MYOP locus was the most frequent in the
sample of animals with positive EBVs for both traits. The
comparison of the allele frequencies between the positive and
negative groups of pigs selected for LC EBVs gave a two-tailed
Fisher's exact test P value close to the chosen threshold for
significance when the complete set of animals was considered
(P = 0.028; Table 2). However, the P values for the two sets of
extreme unrelated pigs (P = 0.051 for two-generation unrelated
pigs, P = 0.130 for three-generation unrelated pigs) did not show
a significant difference between the extreme groups of animals.
It is interesting to note that allele 1, which is generally the most
frequent in Large White animals with positive EBVs for LC, is
also the most frequent allele in Pietrain and Belgian Landrace
(Table 1), which are known to be heavily muscled breeds.
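
The kind of comparison summarised in Table 2 can be sketched as a two-tailed Fisher's exact test on a 2 x 2 table of allele counts. The sketch below is an illustration and not the authors' procedure: the allele counts are reconstructed from the rounded frequencies in Table 2 under the assumption of two alleles per typed pig, so the resulting P value need not reproduce the published one exactly. The function names are hypothetical.

```python
from math import comb


def allele_counts(n_pigs: int, freq_allele1: float) -> tuple[int, int]:
    """Reconstruct allele counts from a rounded frequency.

    Assumption: diploid animals, so 2 * n_pigs alleles were scored.
    """
    total = 2 * n_pigs
    n1 = round(total * freq_allele1)
    return n1, total - n1


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


if __name__ == "__main__":
    # MYOP, lean cuts: positive group (50 pigs, 0.93) vs negative group (50 pigs, 0.81).
    pos = allele_counts(50, 0.93)
    neg = allele_counts(50, 0.81)
    print(pos, neg, fisher_exact_two_tailed(*pos, *neg))
```

Summing only the tables that are no more probable than the observed one is one common definition of the two-tailed test; other definitions exist and can give slightly different values.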

Moreover, a comparison between the MYOP allele frequencies
of the Large White population (Table 1) and those of the group
with positive EBVs for LC (Table 2) shows that the frequency of
allele 1 is higher in the latter (0.93 vs 0.85). This could be an
effect of the selection for this trait, with a putative effect on LC
of the most frequent allele in this breed.


Allele 1 at the TTN locus was in general more frequent in 
the positive EBV groups of animals for ADG and LC but no 
significant allele frequency difference was observed (Table 2). 

Conclusions 

The analysis of the partial 3' UTR of both genes investigated
showed the presence of three SNPs, one for TTN and two for
MYOP, that could be used for association studies with meat
production traits. MYOP was assigned to SSC14 by three
different mapping methods (SCHP, IMpRH and linkage
mapping) and TTN was confirmed on SSC15 by linkage mapping.

The specific gene function and the results obtained from the
analysis of the distribution of the two different alleles of MYOP
warrant further investigation to better evaluate the presence of
a putative effect of MYOP, using larger samples of unrelated
pigs with extreme EBVs for lean cuts.

Abstract. In 1995, Edfors-Lilja and coworkers mapped the
locus for the E. coli K88ab (F4ab) and K88ac (F4ac) intestinal
receptor to pig chromosome 13 (SSC13). Using the same family
material we have refined the map position to a region between
the microsatellite markers Sw207 and Sw225. Primers from
these markers were used to screen a pig BAC library and the
positive clones were used for fluorescent in situ hybridization
(FISH) analysis. The results of the FISH analysis helped to
propose a candidate gene region in the SSC13q41→q44 interval.
Shotgun sequencing of the FISH-mapped BAC clones revealed
that the candidate region contains an evolutionary breakpoint
between human and pig. In order to further characterise the
rearrangements between SSC13 and human chromosome 3
(HSA3), detailed gene mapping of SSC13 was carried out.
Based on these mapping data we have constructed a detailed
comparative map between SSC13 and HSA3. Two candidate
regions on human chromosome 3 have been identified that are
likely to harbour the human homologue of the gene responsible
for susceptibility towards E. coli F4ab/ac diarrhoea in pigs.

Copyright©2003 S. Karger AG, Basel 


Enterotoxigenic Escherichia coli cells (ETEC) that express 
the F4ab or F4ac fimbriae (formerly known as K88ab and 
K88ac) are major causes of diarrhoea and death in neonatal 
and young pigs (Wilson and Francis, 1986). In Denmark, 
ETEC F4 is present in around 25% of the reported diarrhoea 
cases (Ojeniyi et al., 1994). 

Sellwood and coworkers (1975) described two pig phenotypes
in relation to ETEC F4, namely resistant and susceptible
pigs. In 1977 Gibbons et al. showed that ETEC F4ac resistance
was inherited as an autosomal recessive Mendelian trait. It is
assumed that the allele which confers susceptibility to ETEC F4
infection either encodes a molecule that allows the bacteria to
adhere to the intestinal tract or modifies such a molecule to
allow binding. Linkage between the ETEC F4ac locus and the
transferrin locus (TF) was suggested and later confirmed
(Guerin et al., 1993). Linkage mapping of the porcine loci
responsible for susceptibility towards ETEC F4ac and F4ab in a
Wild Boar/Swedish Yorkshire intercross confirmed the segregation
with TF and indicated a regional localization on pig
chromosome 13 (Edfors-Lilja et al., 1995). In 2002, Python and
coworkers reported fine mapping of the receptor locus for
ETEC F4ac, but no orthologous candidate gene region or
candidate gene was indicated or suggested. Currently, the only
available diagnostic test for this type of ETEC F4 resistance is
the adhesion test developed by Sellwood et al. (1975). Since the
adhesion test is very laborious and demands either major
intestinal surgery or slaughter of the pig, it is difficult to include
selection for ETEC F4 resistance in breeding programs. A
molecular test for susceptibility/resistance to ETEC F4 based
on the causal genetic variation at the ETEC F4 receptor locus or
at a linked genetic marker in linkage disequilibrium with the
causal genetic variation would provide a quick, easy and precise
means of genotyping living animals and thus of marker-assisted
selection against susceptible genotypes. In the effort to develop
a DNA-based test we report on the further characterisation of
the chromosome region containing the gene responsible for
resistance/susceptibility towards ETEC F4ab/ac and suggest a
region corresponding to a portion of human chromosome 3.

Materials and methods 

Animals 

In the linkage analysis we used the pedigree described by Edfors-Lilja et
al. (1995). The parental generation comprised two European Wild boars each
mated to four Swedish Yorkshire sows. The F1 generation was intercrossed
(four sires and 22 dams) to generate 200 F2 offspring. In order to obtain large
full-sib families, the matings were repeated and the offspring were
consequently born in two parities.

Adhesion test 

Epithelial cells from the upper part of the small intestine were obtained
from specimens collected after slaughter of all animals. The adhesion test was
performed by incubating the epithelial cells with E. coli F4ab and E. coli
F4ac, respectively. Samples containing 10-20 cells were examined for
adhesion of both E. coli F4ab and E. coli F4ac by interference contrast
microscopy. The results were scored from 1 to 4, where 1 = no bacteria and
4 = bacteria adhering to the whole brush border of all cells (Edfors-Lilja et
al., 1986).

Genotyping 

In order to improve the linkage map of SSC13 in the pedigree, sixty
SSC13 microsatellite markers were selected from the USDA linkage map
(Rohrer et al., 1996; http://www.marc.usda.gov/). These markers were
initially genotyped in the F1 generation. One PCR primer from each of the 60
markers was fluorescently labeled with either 6-FAM, HEX or TET (Applied
Biosystems, Foster City, CA, USA) and PCR was carried out in a PE9600
thermocycler (Applied Biosystems, Foster City, CA, USA) or an ABI877
(Applied Biosystems, Foster City, CA, USA) using a 10 µl reaction volume
containing 25 ng genomic DNA, 1× PCR buffer, 1.5-2.0 mM MgCl2, 200 µM of
each dNTP, 0.35 µM of each primer, and 0.25 units AmpliTaq Gold DNA
polymerase (Applied Biosystems, Foster City, CA, USA). Thermocycling
conditions were: pre-denaturation for 10 min at 95 °C, followed by 10 cycles
with decreasing annealing temperatures (15 s at 95 °C, 30 s at 64-55 °C, 60 s
at 72 °C), 25 cycles of reaction with a fixed annealing temperature (15 s at
89 °C, 30 s at 55 °C, 60 s at 72 °C), and extension at 72 °C for 1 h. PCR
products were loaded on 4.25% polyacrylamide denaturing sequencing gels,
and run on an ABI PRISM 377 DNA sequencer. The results were analyzed
using the GeneScan 2 software (Applied Biosystems, Foster City, CA, USA).
Markers that amplified well, were easy to score and showed heterozygosity in
the F1 generation of the pedigree were selected for linkage mapping using all
236 animals. Alleles were assigned and genotyping data were managed using
the GEMMA software (Iannuccelli et al., 1996).
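
The thermocycling conditions described above can be written out as an explicit step list. This is only a sketch: the text gives the annealing range (64-55 °C over 10 cycles) but not the decrement, so a drop of 1 °C per cycle is assumed here, and ramp times between steps are ignored. The function name is hypothetical.

```python
def touchdown_schedule() -> list[tuple[str, int, int]]:
    """Return (step name, temperature in deg C, hold time in s) tuples.

    Assumption: annealing temperature falls by 1 deg C per cycle during
    the first 10 cycles (64, 63, ..., 55); the paper states only the range.
    """
    steps = [("pre-denaturation", 95, 600)]
    # Ten touchdown cycles with decreasing annealing temperature.
    for anneal in range(64, 54, -1):
        steps += [("denature", 95, 15), ("anneal", anneal, 30), ("extend", 72, 60)]
    # Twenty-five cycles with a fixed annealing temperature.
    for _ in range(25):
        steps += [("denature", 89, 15), ("anneal", 55, 30), ("extend", 72, 60)]
    steps.append(("final extension", 72, 3600))
    return steps


if __name__ == "__main__":
    schedule = touchdown_schedule()
    hold_seconds = sum(seconds for _, _, seconds in schedule)
    print(f"{len(schedule)} steps, {hold_seconds / 60:.1f} min of programmed hold time")
```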

In addition to the microsatellite markers, four gene-derived markers (CP,
EST24F05, TF, and CAMP) were included in the existing SSC13 map for
this pedigree (Marklund et al., 1996). CP and TF are serum proteins.
EST24F05 is a single-strand conformation polymorphism associated
with the porcine dystroglycan (DAG1) locus and the polymorphism was
genotyped as described by Jorgensen et al. (1997). The CAMP (alias PR39)
locus is a homolog of the human gene for peptide antibiotic FALL-39.

Linkage analysis 

The genotyping data were used for the construction of an SSC13 linkage
map. The linkage analysis was performed using CRIMAP version 2.4 (Green
et al., 1990). Initially, the option TWOPOINT was used to find linkage
between the markers with a lod score higher than three. Subsequently, the
option BUILD was used to construct the framework map and the remaining
markers were incorporated using the option ALL. Finally, the genotypes were
checked using the option CHROMPIC and the data were scrutinized for any
unlikely double recombinants.


Cytogenetic mapping 

Microsatellite markers surrounding the F4ab/ac locus were used to
screen a pig BAC library (Anderson et al., 2000). The BAC clones were
identified using PCR primers from markers Sw207, S0283, S0075 and Sw225 on
DNA pools from the BAC clones.

DNA was extracted from the marker-positive BAC clones using the Qiagen
Plasmid Midiprep kit (Qiagen, Germany) and the BACs were individually
labelled with biotin-14-dATP or digoxigenin-11-dUTP (Boehringer Mannheim,
Germany) for both single-color and dual-color FISH analysis to porcine
metaphase and interphase chromosomes as described by Chowdhary et
al. (1995).

Comparative mapping 

DNA from the FISH-mapped BACs was digested using Sau3AI and the
fragments were ligated into the BamHI site of pUC19 and transformed into
Epicurian Coli XL1-Blue cells (Stratagene). The transformants were plated
out on LB ampicillin plates and around 100 subclones for each BAC were
picked at random. Plasmid DNA was isolated from the subclones using a
Qiaprep spin miniprep kit (Qiagen, Germany). The inserts were sequenced
using BigDye terminator sequencing (Applied Biosystems, Foster City, CA,
USA) and T3 and T7 primers and electrophoresed on an ABI377 (Applied
Biosystems, Foster City, CA, USA). The generated sequences were
BLAST-searched against the non-redundant nucleotide database at the NCBI
website (http://www.ncbi.nlm.nih.gov/).

The shotgun sequences with similarity to KIAA0804, TRAD and
SEC22L2 were selected to further improve the comparative map between
SSC13 and HSA3. In addition, expressed sequence tags (ESTs) predicted to
map to SSC13 on the basis of sequence similarity to genes and ESTs known
to map to the orthologous human chromosome (HSA3) were selected from
our resource of porcine small intestine ESTs (Wintero et al., 1996; Wintero
and Fredholm, unpublished). The criterion for selection was that the 5'
cDNA sequences of the respective clones had significant sequence identity
(expectation values < 1e-6) with the human sequences in the non-redundant
nucleotide database as revealed using BLAST (Altschul et al., 1990;
http://www.ncbi.nih.gov/BLAST/).

Primers were designed in the 3' UTR of the selected clones (see
Table 2) in order to increase pig specificity. The Primer3 website at
http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi (Rozen and
Skaletsky, 2000) was used for designing primers. A pig somatic cell hybrid
panel (Yerle et al., 1996) and a pig radiation hybrid panel (IMpRH) (Yerle et
al., 1998) were used for regional assignment and mapping. PCR was performed
as previously described (Cirera et al., 2003). The PCR results were directly
introduced into the SCH and RH data analysis programs at
http://www.toulouse.inra.fr/lgc/pig/pcr/pcr.htm and
http://imprh.toulouse.inra.fr/, respectively. Regional assignment was
achieved by using the computer program developed by Chevalet et al. (1997).
The results of the radiation hybrid PCR products were analysed with the
IMpRH mapping tool developed by Milan et al. (2000).


Results 

Genotyping 

Of the sixty tested microsatellite markers the following 41
markers were informative, easy to score and therefore genotyped
in the family material: S0075, S0076, S0084, S0103,
S0215, S0219, S0222, S0281, S0282, S0283, S0287, S0291,
SW1030, SW1056, SW129, SW163, SW1833, SW1864,
SW1876, SW1898, SW1930, SW2054, SW207, SW2196,
SW225, SW2412, SW398, SW458, SW482, SW520, SW698,
SW769, SW864, SW873, SW882, SW937, SW955, SW992,
SWR1008, SWR428 and SWR926. A framework linkage map
was constructed using the following 28 most informative markers:
S0075, S0076, S0215, S0219, S0222, S0281, S0282, S0291,
SW1030, SW1056, SW1864, SW1898, SW207, SW2196,
SW225, SW2412, SW398, SW458, SW698, SW769, SW864,
SW873, SW882, SW937, SW955, SWR1008, SWR428 and
SWR926. Detailed descriptions of the genetic markers can be
found on the Web site of the USDA Meat Animal Research
Center (http://www.marc.usda.gov/) and in the pig genome
database (http://www.thearkdb.org/pig).

Linkage analysis

The framework map is shown in Fig. 1 (marker order supported
by odds of 1000:1). The total length of the map is 159 cM.
When the data for the F4ab/F4ac locus (the location of the gene
conferring resistance/susceptibility to ETEC F4-induced disease)
are added, the most likely position for this locus is distal to
Sw207 and proximal to Sw225, supported by a lod score of more
than 3. Using the FLIPS option, the most likely position for this
locus is proximal to S0075 (Fig. 1), in comparison with the next
best order where the ETEC F4ab/ac locus is located distal to S0075.
A total of four putative recombinants were observed between the
F4ab and F4ac loci, but they all appeared as unlikely double
crossovers, i.e. none of the putative recombinant events were
supported by data on the flanking markers. We found ten putative
recombination events between S0075 and F4ab/ac. Of
these ten, nine appeared as double recombinants for the F4ab/ac
locus alone and one was a single recombination between
S0075 and the ETEC F4ab/ac locus supported by flanking markers.

Cytogenetic mapping 

The physical order of the markers was shown to be in accordance
with the linkage data, namely CEN-Sw207-S0075-Sw225-TEL.
Although the S0283 marker could not be ordered
on the framework map, the marker could be assigned to the
interval between Sw207 and Sw225 and thus it was used for BAC
screening. The BAC positive for the S0283 marker was assigned
to the interval between Sw207 and S0075. BACs containing
markers Sw207, S0283 and S0075 were all observed to
hybridise to pig chromosome 13 band q41, whereas the Sw225
BAC hybridised to 13q44.

Comparative mapping 

The genes identified after the shotgun sequencing of the 
BACs are listed in Table 1. 


Table 1. Genes identified in the isolated BAC clones. The human data was taken from the Entrez
Genome view build 33 at http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?

Marker   BAC             Cytogenetic position   Map element       HSA3 position
                         on SSC13
Sw207    PigEBAC169o10   q41                    KIAA0804          186 Mbp (q27.3→q28)
S0283    PigEBAC177o11   q41                    TRAD              126 Mbp (q21.1)
S0075    PigEBAC169f15   q41                    ADCY5, SEC22L2    124 Mbp (q13→q21)
Sw225    PigEBAC76g23    q44                    AC063923.21       110 Mbp (q13.1)




Fig. 1. A framework map of SSC13 showing the location of the F4ab/ac locus (in italics).
Total map distance: 159.1 cM.

Table 2. List of pig genes mapped in this study

Map element           Clone           GenBank    HSA3 position             Pig cytogenetic                Closest marker on    PCR primers
(E value)             (lab id.)       Acc. No.                             position                       the pig RH map       Upper                         Lower
                                                                                                          (lod score)
AC063923.21 (2e-20)   PigEBAC76g23    AY156081   110 Mbp (3q13.1)          13q44                          only FISH mapped     not applicable                not applicable
ADPRTL3 (6e-13)       c16b04          AJ508800   52 Mbp (3p22.2-p21.1)     13q21-q22                      SSC24F05 (9.65)      5'-CCCAGCCATGCTAGGACTAA       5'-AGATTCGCCTCTGAGGTGTC
ARF4 (1e-133)         c11f10          AJ508808   57 Mbp (3p21.2-p21.1)     13q21-q22 or 13q23-1/2q41      SWR2054 (6.7)        5'-ACCAAAAGCAACATGCAACA       5'-CAGGGAATGCTCCAAAACAC
ARMET (2e-74)         c15g08          AJ508798   51 Mbp (3p21.1)           13q21-q22 or 13q23-1/2q41      SSC24F05 (6.95)      5'-TAGTGTAAACCCGCAACAGA       5'-AACAGTTCATCTGTGTCTTC
GPX1 (1e-6)           c17d11          AJ508799   49 Mbp (3p21.3)           13q21-q22 or 13q23-1/2q41      SSC24F05 (16.6)      5'-TAGTGAGGAACTGTGGTCTG       5'-ATATCGAGCCTGACATCGAA
KIAA0804 (3e-49)      PigEBAC169o10   AY156078   186 Mbp (3q27.3-q28)      13q41                          SW207 (15.41)        5'-CTATGTGCCCATGTGCATTC       5'-AACCTGAGAGCATCGGTCAC
KIAA1363 (1e-64)      c03b02          AJ508807   174 Mbp (3q26.1-q26.33)   13q23-1/2q41                   S0084 (9.3)          5'-TCAAGAGGGGCTCAACACTT       5'-TGGAATCATGTACGCAAAGC
MME (2e-77)           c14c07          AJ508801   156 Mbp (3q25.1-25.2)     13q23-1/2q41                   SW1495 (4.4)         5'-CATATCCACTCCAGGGACAC       5'-ACCAAGACAGTTATGAACCA
RFC4 (7e-30)          c18a04          AJ508811   188 Mbp (3q27)            13(1/2q46-q49)                 SIAT1 (20.63)        5'-CGGTGCTTTGGTCATTTTTA       5'-TGCTTAGCTGATGGTGCTGA
RPL29 (3e-80)         c11b05          AJ508797   52 Mbp (3p21.3-p21.2)     13q21-q41                      SW864 (7.01)         5'-GACAGATCCTGAGGCAGGTT       5'-CAGGTTCTGCCGGCCAAAGT
RYBP (7e-43)          c17g07          AJ508795   72 Mbp (3p14.1)           13q23-1/2q41                   SWR1008 (11.21)      5'-AAGCAGAGCAGGTCAATTAAGG     5'-TATTCAGCGGCACAGTAAGC
SEC22L2 (2e-60)       PigEBAC169f15   AY156080   124 Mbp (3q13-q21)        13q41                          S0075 (20.84)        5'-CCAGCCGGTGTAGTAGACAAG      5'-CCCTTTTAAGGTGTGGAGCTT
SEP1 (2e-43)          c18g08          AJ508802   143 Mbp (3q23)            13q23-1/2q41                   SW882 (16.16)        5'-ACAGCATGAAAAGTGCCTGA       5'-TCCATATCTGTGTCTCATAAAAA
SST (1e-131)          c09c04          AJ508810   189 Mbp (3q28)            13(1/2q41) or 13(1/2q46-q49)   SIAT1 (13.44)        5'-TTTGGAGGAGAGGAATTGGA       5'-TGGAGCCTGAAGATTTGTCC
TFDP2 (3e-83)         c13g02          AJ508806   143 Mbp (3q23)            13q23-1/2q41                   SW2459 (11.56)       5'-ATAGTAAAACGCGGGTTTGC       5'-GCTGAAGTGGCCTTAGCAAC
TFG (6e-64)           c17b07          AJ508803   102 Mbp (3q11-q12)        13q42-1/2q46                   SWR1306 (5)          5'-AGATGACTGAACTTCAACCTAGCA   5'-AGCAGCTTCCTAGTTACTTTGG
TRAD (5e-76)          PigEBAC177o11   AY156082   126 Mbp (3q21.1)          13q41                          S0075 (13.99)        5'-CAGGAAGAGCCCCCTAAATC       5'-CAGCAAAGGCAGAAACCTTC
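
As a quick sanity check on primer pairs such as those listed in Table 2, length, GC content and a rough melting temperature can be computed directly from the sequence. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a coarse estimate for short oligonucleotides and is not the method Primer3 uses; the function names are hypothetical, and the example sequences are the ADPRTL3 primers from Table 2 with the OCR spacing removed.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule.

    Assumption: this rule of thumb is adequate for a first look at
    primers of roughly 20 nt; it ignores salt and nearest-neighbour effects.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primers = {
        "ADPRTL3 upper": "CCCAGCCATGCTAGGACTAA",
        "ADPRTL3 lower": "AGATTCGCCTCTGAGGTGTC",
    }
    for name, seq in primers.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
```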


In addition to the sequences derived from the shotgun
sequencing, a total of 13 pig ESTs were mapped using the
somatic cell hybrid panel and the radiation hybrid panel. The
genes and their localisations are shown in Table 2.

Based on the mapping data and the matches between the 
sequences derived from the pig BAC clones/pig cDNA clones, 
and the human genomic sequence a comparative map between 
SSC13 and HSA3 was drawn (Fig. 2). 

Discussion 

A framework linkage map of SSC13 was constructed and the
linkage analysis positioned the F4ab/ac locus between markers
Sw207 and Sw225 with a LOD score higher than 3. A total of
four putative recombinants were observed between the F4ab
and F4ac loci, but they all represented putative double
recombinants over an interval of only 6 cM in a region for which
there was very good marker coverage. We therefore conclude that
these are most likely false recombinants arising from deviations
from a strict monogenic inheritance or typing errors in the
F4ab/ac adhesion test. Problems with genotype ascertainment
based on assays such as the adhesion test, which are difficult to
interpret and can generate spurious putative recombinants,
are a challenge for fine-scale linkage mapping. In agreement
with the results of Python et al. (2002) we therefore believe that
susceptibility towards F4ab and F4ac is controlled by the same
locus. Thus, we have mapped the locus responsible for
susceptibility to ETEC F4ab- and F4ac-induced diarrhoea with
confidence to a genetic interval of 6 cM.

Fig. 2. A comparative map between SSC13 and HSA3 using the mapping
results in this study.

In a region of only 6 cM we would expect fewer than four
double crossovers in 1000 meioses, assuming no interference.
All the double recombination events in the Sw207-Sw225
region were due to the results of the adhesion test. Thus, these
data were considered highly unlikely and excluded from the
analysis simply by eliminating the genotypes from the animals
showing double recombinations. After scrutinising the data we
observed only a single believable recombination between
S0075 and the ETEC F4ab/ac locus among 284 informative
meioses. The single recombination event between S0075 and
ETEC F4ab/ac indicates that the most probable position of
ETEC F4ab/ac is in the region between Sw207 and S0075, an
interval of 3.8 cM; however, the lod score support for this
position in comparison with a position distal to S0075 is lower
than three. If we nevertheless consider the single recombination,
the most probable order and the genetic distances become
Sw207-3.4 cM-ETEC F4ab/ac-0.4 cM-S0075-1.6 cM-Sw225.
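
The two figures quoted in the Discussion, fewer than four double crossovers per 1000 meioses in a 6 cM region and a distance of about 0.4 cM from one recombinant in 284 informative meioses, can be checked with a short calculation. The sketch assumes no interference and treats the recombination fraction as equal to the map distance, which is reasonable only for small intervals; the squared interval is an upper bound, since both crossovers must fall within the 6 cM region. The function names are hypothetical.

```python
def expected_double_crossovers(interval_cm: float, n_meioses: int) -> float:
    """Upper bound on the expected number of double crossovers.

    Assumptions: no interference, and recombination fraction equal to
    map distance (1 cM = 0.01), valid only for short intervals.
    """
    r = interval_cm / 100.0
    return r * r * n_meioses


def recombination_fraction(n_recombinants: int, n_meioses: int) -> float:
    """Observed recombination fraction among informative meioses."""
    return n_recombinants / n_meioses


if __name__ == "__main__":
    print(f"double crossovers per 1000 meioses: {expected_double_crossovers(6.0, 1000):.1f}")
    print(f"S0075 to F4ab/ac: {100 * recombination_fraction(1, 284):.2f} cM")
```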


BACs isolated using the markers from the candidate interval
were used to cytogenetically map the region containing the
F4ab/F4ac resistance/susceptibility gene and as a source of
sequence information for comparative mapping. By FISH analysis
the candidate interval was cytogenetically anchored to the
SSC13q41→q44 region. If only the Sw207-S0075 interval is
considered, the region is limited to the SSC13q41 band. This
result further refines and anchors the region suggested by
Python and coworkers (2002).

The comparative mapping analyses described here were
performed at two levels. First, pig expressed sequence tags
(ESTs) predicted to map to chromosome 13 on the basis that
their human homologues map to HSA3 were mapped in somatic
cell hybrid and radiation hybrid mapping panels. The resulting
gene map of pig chromosome 13 allowed us to align this
chromosome with its known human homologue HSA3. The
comparative mapping data generated in this study primarily
refine the gene mapping around the ETEC F4ab/ac region,
but all data are in agreement with previously reported
pig-human data (Sun et al., 1999; Van Poucke et al., 1999, 2001;
Pinton et al., 2000). Secondly, in order to develop a more
detailed comparative pig-human gene map of the region
containing the F4ab/F4ac susceptibility/resistance locus, we
sample sequenced BAC clones known to contain genetic markers
from the region of interest. Unfortunately, the sequences
associated with the markers Sw207 and Sw225, which define the
confidence interval for the F4ab/F4ac locus, show homology to
locations on HSA3 that are about 70-80 Mbp apart. Comparisons
of sequences associated with markers within the Sw207-Sw225
interval (i.e. S0283 and S0075) with the human genome
sequence indicate that pig chromosome 13 and human
chromosome 3 share homology but are not co-linear over the
interval between Sw207 and Sw225.

By combining the linkage data with the comparative data,
two regions on HSA3 can be pointed out as potential candidate
regions in the search for the gene responsible for susceptibility
towards E. coli F4ab/ac diarrhoea in pigs. The corresponding
human regions are on HSA3q21 and q28→qtel. If we only
consider the Sw207-S0075 region, the orthologous candidate gene
region is HSA3q28→qtel.

Abstract. The PRKAG3 gene encodes the γ3 chain of
AMP-activated protein kinase (AMPK). A non-conservative
missense mutation in the PRKAG3 gene causes a dominant
phenotype involving abnormally high glycogen content in pig
skeletal muscle. We have determined > 126 kb (in 13 contigs) of
porcine genomic sequence surrounding the PRKAG3 gene and the
corresponding mouse region covering the gene. A comparison of
these PRKAG3 sequences and the human sequence was conducted
and used to predict evolutionarily conserved regions,
including regulatory regions. A comparison of the human
genomic sequence and a porcine BAC sequence containing the
PRKAG3 gene revealed a conserved organization and the
presence of three additional genes, CYP27A1 (cytochrome
P450, family 27, subfamily A, polypeptide 1), STK36 (Serine
Threonine Kinase 36), and the homolog of the unidentified
human mRNA KIAA0173. Interspersed repetitive elements
constituted 51.4 and 38.6% of this genomic region in human
and pig, respectively. We were able to reliably align 12.6 kb of
orthologous repeats shared between pig and human and these
showed an average sequence identity of 72.4%. Our analysis
revealed that the human KIAA0173 gene harbors alternative 5'
untranslated exons originating from repetitive elements. This
provides an obvious example of how transposable elements may
affect gene evolution.

Copyright©2003 S. Karger AG, Basel 


The PRKAG3 gene was identified by positional cloning as
the causative gene for a dominant phenotype involving high
glycogen content in pig skeletal muscle (Milan et al., 2000; Jeon
et al., 2001). PRKAG3 encodes the γ3 chain of AMP-activated
protein kinase (AMPK) that is predominantly expressed in
skeletal muscle. A BAC clone containing the gene and covering
about 130 kb was sequenced as part of the positional cloning
effort. Since only a limited amount of porcine genome sequence
is publicly available at present, our sequence data provided
an opportunity to study the level of conservation of coding
and non-coding sequences between the human and porcine
genomes.

With the increasing amount of data generated by large scale
sequencing efforts, there is a need for tools to identify
functionally important sequences. Several computational tools have
been developed for gene predictions and homology searches
against EST databases. Furthermore, besides the identification
of coding sequences, a more difficult task is the identification
and characterization of non-coding functional elements, e.g.
promoters, untranslated exons, enhancers, and silencers.
Comparative genome sequencing appears to be the most powerful
approach to identify evolutionarily conserved non-coding
sequences. The functional characterization of these elements still
requires laborious experimental studies. Human and mouse are
by far the two mammalian genomes for which the largest
amount of genomic sequence is available and sequence comparison
between them has proven to be efficient for the identification
of coding sequences as well as regulatory elements
(Mouse genome sequencing consortium, 2002).

Comparative sequence analysis is also a powerful tool for
understanding gene evolution. The majority of genes are
organized into families and superfamilies, reflecting an ancient and
continuing process of gene duplication and divergence.
Interspersed repeats are widespread in the genome and may
influence gene evolution as random integrations occur in both
coding sequences and regulatory elements (International human
genome sequencing consortium, 2001; Mouse genome
sequencing consortium, 2002; Jordan et al., 2003). With the
completion of the human genome sequence, it became possible
to observe that repeats have reshaped the genome by causing
ectopic rearrangements, creating entirely new genes, and modifying
and reshuffling existing genes (International human genome
sequencing consortium, 2001; Venter et al., 2001). In this
study, we compare i) the sequence of pig, mouse, and human
PRKAG3 genes, ii) the location and sequence identity of
interspersed repeats between human and pig in the vicinity of the
PRKAG3 gene, and iii) the organization of three additional
genes located near the PRKAG3 gene in these genomes. We
also show that interspersed repeats can be involved in the
formation of new exons in the 5' UTR of a gene, giving an example
of gene evolution closely related to the dynamic nature of
interspersed repeats.

Materials and methods 

BAC sequencing 

The pig BAC clone 127G6 (Jeon et al., 2001) was sequenced as described
in Amarger et al. (2002). In brief, the BAC clone was shotgun sequenced and
most gaps were closed by primer walking. A mouse BAC library was screened
using commercially available high-density filters (RPCI-22 or 23 library,
ResGen, Invitrogen Corporation) with a partial pig PRKAG3 cDNA probe.
BAC DNA was purified using an alkaline lysis method and digested with
BamHI (New England Biolabs). Restriction fragments were separated in a 1%
agarose gel and transferred onto a nylon membrane (Hybond N+,
Amersham-Pharmacia Biotech). The filter was then hybridized using an
[α-32P]dCTP-labeled (Megaprime DNA labeling kit, Amersham-Pharmacia
Biotech) pig cDNA probe. Positive fragments containing the whole PRKAG3
gene were then subcloned into BamHI-restricted pUC18 and sequenced by
primer walking. The sequence data reported in this paper have been
submitted to GenBank and have been assigned the accession numbers AY264345
(pig PRKAG3 alternative transcript), AY263454 (pig BAC clone sequence),
and AY263402 (mouse Prkag3 gene).

Sequence assembly and analysis 

Sequences were assembled using the Phred/Phrap/Consed package
(Ewing et al., 1998; Gordon et al., 1998). The assembled sequences were then
analyzed with a variety of computer software programs. Sequence
comparison with cDNA sequences was done using pairwise BLAST at
http://www.ncbi.nlm.nih.gov. Repetitive elements were localized and
identified by RepeatMasker (A.F.A. Smit and P. Green, unpublished;
http://ftp.genome.washington.edu/index.html). Sequence identity plots were
obtained using VISTA (Dubchak et al., 2000; Mayor et al., 2000) at
http://www-gsd.lbl.gov. CpG islands were identified using Grail/CpG within
the NIX application (G.W. Williams, P.M. Woollard, and P. Hingamp,
unpublished; http://www.hgmp.mrc.ac.uk/NIX/).
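
The identity criterion applied with VISTA in the Results (75% identity over
100 bp) can be illustrated with a sliding window over a pairwise alignment.
The following is only a minimal sketch and not the VISTA implementation: the
function names are hypothetical, the input is assumed to be two gapped
alignment strings of equal length, and gap columns are assumed to count as
mismatches.

```python
def window_identity(aln_a, aln_b, window=100):
    """Percent identity of every full window along two aligned strings.

    Assumption: both strings come from the same pairwise alignment, so they
    have equal length and use '-' for gaps.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where both columns hold the same base; a gap never counts as a match
    matches = [int(a == b and a != "-")
               for a, b in zip(aln_a.upper(), aln_b.upper())]
    if len(matches) < window:
        return []
    running = sum(matches[:window])
    scores = [100.0 * running / window]
    # Slide by one column: add the entering column, drop the leaving one
    for i in range(window, len(matches)):
        running += matches[i] - matches[i - window]
        scores.append(100.0 * running / window)
    return scores


def conserved_windows(aln_a, aln_b, window=100, threshold=75.0):
    """Start offsets of windows whose identity meets the threshold."""
    scores = window_identity(aln_a, aln_b, window)
    return [i for i, score in enumerate(scores) if score >= threshold]
```

With the defaults, `conserved_windows(human_aln, pig_aln)` would list the
alignment offsets at which a 100-column window reaches 75% identity;
overlapping windows would still have to be merged into segments.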

RT-PCR analysis of the PRKAG3 and CYP27A1 transcripts

Adult pig tissue samples were immediately frozen in liquid nitrogen and
stored at -70 °C until total RNA was prepared using TRIzol (GIBCO BRL)
according to the manufacturer’s protocol. First-strand cDNA synthesis was
done using total RNA samples following the manufacturer’s instructions
(Amersham Pharmacia Biotech). RT-PCR analysis using human cDNA was
performed on human skeletal muscle double-stranded cDNA (BD Biosciences
Clontech).


Table 1. Exon and intron length of the PRKAG3, CYP27, KIAA0173,
and STK36 genes in human and pig and mouse (Prkag3)

Exon/       Exon length (a)               Intron length (a)
intron no.  Human     Pig       Mouse     Human     Pig       Mouse

PRKAG3
0                     341 (b)                       588 (b)
1           33        108       33        362       302       304
2           40        40        40        434       478       447
3           156       156       153       361       360       313
4           404       404       407       1,377     890       1,200
5           82        82        82        456       460       114
6           59        59        59        125       101       100
7           46        46        46        203       216       179
8           55        55        55        201       201       469
9           127       127       127       154       132       99
10          166       166       166       2,349     1,127     552
11          38        38        38        170       175       168
12          147       147       147       341       356       330
13          117       117       117
3'UTR       116 (c)   477                 569 (c)
            694 (c)

CYP27
1           255       261                 27,139    >32,593
2           191       191                 2,454     >4,303
3           200       200                 130       126
4           198       198                 173       131
5           173       173                 924       923
6           167       167                 192       201
7           79        82                  86        94
8           213       213                 153       131
9           120       120
3'UTR       236       217

KIAA0173
1           30        30                  16,231    ND
2           79        -                   10,231    ND
3           1,585     1,586               894       1,041
4           110       110                 353       355
5           64        64                  4,524     >5,754
6           125       122                 457       448
7           111       111                 353       348
8           77        77                  771       736
9           192       192                 95        99
10          83        83                  224       232
11          129       129                 379       ND
12          123       ND                  608       ND
13          99        ND                  318       ND
14          135       135                 621       397
15          138       138                 1,557     1,270
16          103       103                 956       599
17          211       211                 141       142
18          71        71                  398       393
19          86        86                  464       329
20          256       259
3'UTR       1,024

STK36
15          151       151                 289       173
16          128       128                 529       562
17          105       105                 378       348
18          99        99                  73        169
19          89        89                  280       256
20          64        64                  197       154
21          111       111                 2,405     2,228
22          175       175                 249       246
23          148       148                 243       226
24          148       148                 600       539
25          747       747                 1,787     1,454
26          144       144
3'UTR       699

a ND not determined.
b Exon 0 is found only in pig in an alternative transcript (see Results).
c Human 3' UTR consists of two exons.

Results 

Sequencing of porcine BAC clone 127G6 and the mouse Prkag3 gene

About 126 kb of sequence were obtained by shotgun
sequencing of the BAC clone 127G6 (GenBank acc. no.
AY263454). This sequence consists of 13 contigs ranging from
175 to 25,977 bp. The order and orientation of the contigs have
been determined according to the comparison with the human
sequence and confirmed by PCR amplification using the BAC
DNA as template. Remaining gaps do not exceed several
hundred base pairs and are mostly due to sequencing problems,
usually because of large stretches of A or T nucleotides situated
at the end of repetitive sequences.

A contiguous sequence spanning 10,480 bp and containing
the mouse Prkag3 gene (GenBank acc. no. AY263402) was
obtained from the mouse BAC clone. This sequence was
obtained from three BamHI restriction fragments detected by
Southern hybridization of BAC DNA and cloned into pUC18.

Comparative sequence analysis of PRKAG3 between human, pig, and mouse

Genomic sequences were determined for the PRKAG3 gene
in pig and mouse. Pig cDNA sequence was described earlier
(Milan et al., 2000). The human cDNA and genomic sequences
were available in GenBank (NM_017431 and AC009974,
respectively). Alignment of cDNA and genomic sequences in the
three species provided evidence of a conserved exon/intron
organization and all splice acceptor and donor sequences were
found to conform to the GT-AG rule. Exon lengths are
perfectly conserved with the exception of exon 1 which is longer in
the pig (108 versus 33 bp in human and mouse; Table 1). The
coding sequence is highly conserved with 85.3, 84.4, and 83.1%
identity between human/pig, human/mouse, and pig/mouse,
respectively. Alignment of the human, pig, and mouse
PRKAG3 sequences using VISTA, with the criterion of 75%
identity over 100 bp, revealed several conserved regions
outside the coding sequence. They cover 2.9 kb between human
and pig with an average sequence identity of 66.2%. Among
these regions, there is a 350-bp long 3' UTR and several
conserved blocks in the 5' region of the gene, covering 1,250 bp
with sequence identities varying from 75 to 80%. The sequence
identity scores were significantly lower for non-coding regions
in comparison with the mouse; the aligned regions covered 800
and 320 bp with 76.4 and 75.6% identity between human/mouse
and pig/mouse, respectively. In the 5' region of the gene,
two regions are conserved in the three species, including a
100-bp 5' UTR present in a mouse EST (GenBank acc. no.
BB630381) and a 220-bp segment located about 700 bp
upstream of the start codon (Fig. 1A). This latter segment contains
a highly GA-rich region of approximately 100 bp that is very
well conserved between the three species, suggesting a potential
function in gene regulation.
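
The GT-AG check mentioned above can be sketched as follows. This is an
illustrative example only: the function names and the coordinate convention
(0-based exon starts, exclusive ends, exons sorted along the genomic
sequence) are assumptions made for the sketch and do not describe the
procedure actually used for the alignments.

```python
def introns_from_exons(exons):
    """Intron (start, end) intervals lying between consecutive exons.

    `exons` is a sorted list of (start, end) pairs, 0-based, end exclusive.
    """
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def follows_gt_ag(genomic, intron):
    """True if the intron begins with GT (donor) and ends with AG (acceptor)."""
    start, end = intron
    seq = genomic[start:end].upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def check_gene(genomic, exons):
    """Map every intron interval to whether it conforms to the GT-AG rule."""
    return {intron: follows_gt_ag(genomic, intron)
            for intron in introns_from_exons(exons)}
```

Given exon coordinates obtained from a cDNA-to-genomic alignment,
`check_gene` returns one boolean per intron, so any non-conforming splice
site stands out immediately.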


Available pig and human PRKAG3 cDNA sequences harbor
very short 5' UTRs, suggesting that they might not be full
length. We performed RT-PCR on human and pig skeletal
muscle cDNA using primers matching highly conserved regions
in order to identify potential 5' UTR exons. One upstream
exon was identified in pig only. Surprisingly, the
obtained transcript contained a 5' exon (referred to here as exon 0,
see Table 1 and Fig. 1A) and exons 2-13 but lacked exon 1 and
consequently the start codon. A putative start codon exists in
exon 0 but is followed by a STOP codon 36 bp thereafter. The
next putative start codon is in exon 3. An alternative mouse
transcript is also found in the EST database (GenBank acc. no.
AI664508), harboring a longer exon 1 (136 bp longer at the 3'
end) and a longer exon 2 (60 bp longer at the 5' end), but both exons
contain STOP codons. Surprisingly, there are no human ESTs
containing the 5' region of the gene and we could not obtain any
RT-PCR product containing any of the 5' conserved regions,
suggesting that the human transcripts do not have a longer 5'
UTR.
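
The reasoning about the putative start codon in exon 0 amounts to scanning
the reading frame that begins at an ATG until the first in-frame STOP codon.
A minimal sketch is given below; the function name and the toy sequence in
the usage note are hypothetical and are not taken from the pig transcript.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(seq, start_codon="ATG"):
    """Locate the first start codon and the first STOP codon in its frame.

    Returns None if there is no start codon, otherwise a tuple
    (start_index, stop_index) where stop_index is None when no in-frame
    STOP codon follows.
    """
    seq = seq.upper()
    atg = seq.find(start_codon)
    if atg == -1:
        return None
    # Walk codon by codon in the frame defined by the start codon
    for i in range(atg + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return atg, i
    return atg, None
```

For a toy sequence such as `"CCATGGCCGCCGCCTAAGG"`, the function reports the
ATG at offset 2 and the in-frame TAA at offset 14, i.e. a very short open
reading frame of the kind described for exon 0.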

There is not yet any experimental evidence of the location of
the PRKAG3 gene promoter. Several promoter prediction programs
were used on the human, pig, and mouse sequences, but
none of the predicted promoters was shared by all three
species, or even by two of them. The conserved region situated
immediately upstream of exon 1 contains one of the promoters
that was predicted from the pig sequence. A closer examination
of the alignment of the three sequences in that region (Fig. 1B)
shows that it is highly conserved between the three species. It
harbors some characteristics of a core promoter: (1) an
imperfect TATA box, (2) a GC-rich element upstream of the TATA
box, (3) a CCAAT box about 50 bp upstream of the TATA box, and
(4) a transcription start motif CACT. The fact that this
sequence was identified as a putative promoter only in the pig
may be due to the structure of the putative TATA box (AAAATA
in pig, AGAATA in human and mouse). Putative transcription
factor binding sites are present in this region, including
three SP1 binding sites and several E-boxes (consensus
CANNTG). E-boxes are known as the DNA-binding sites for the
helix-loop-helix family of transcription factors including the
muscle-specific proteins of the MyoD family (Wright, 1992)
and have been characterized in a number of skeletal
muscle-specific promoters (Baker et al., 1998; Van Maanen et al.,
1999). Further experimental studies are required to
characterize the PRKAG3 promoter.
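
Locating candidate E-boxes in a promoter region is a simple pattern search.
The sketch below assumes the E-box consensus as it is commonly written,
CANNTG; the function name and the toy sequence in the usage note are
hypothetical, and no transcription factor database is consulted. Because the
reverse complement of CANNTG again matches CANNTG, scanning one strand is
sufficient for this motif.

```python
import re

# A lookahead is used so that overlapping occurrences are also reported
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (offset, motif) for every CANNTG occurrence in `seq`."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq.upper())]
```

For example, `find_eboxes("GGCAGCTGAACACCTGTT")` reports CAGCTG at offset 2
and CACCTG at offset 10. Such hits are only candidates; as stated above,
experimental work is needed to show that any of them is functional.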

Comparative sequence analysis of the whole ~126-kb region between human and pig

Three complete genes and one partial gene are present in
BAC 127G6. A BLAST search of the entire pig BAC sequence
against the human genome sequence revealed the presence of
three genes in addition to PRKAG3: the Cytochrome P450,
family 27, subfamily A, polypeptide 1 gene (CYP27A1), the
homolog of an unidentified human mRNA (KIAA0173), and
part of the Serine Threonine Kinase 36 (STK36) gene; the latter
was previously designated as KIAA1278. The order, orientation,
structure, and sequence of these three genes are conserved
between human and pig (Table 1 and Fig. 2), except for the
5' UTR of the KIAA0173 gene (see below). Exons 2, 12, and 13
of the KIAA0173 gene were not found in the pig because of the
presence of several gaps in the sequence or because they do not
exist (see below for exon 2). Two CpG islands were identified in
the same position in both species. They correspond to the
promoter region of CYP27A1 and KIAA0173 (Fig. 2).

Fig. 1. (A) Comparative structure of the 5' region of the PRKAG3 gene.
Non-coding conserved segments are in light grey, exons in dark grey. Pig
and mouse alternative transcripts are shown (pig alternative transcript;
mouse ESTs BB629521 and AI664508); the GA-rich region is indicated.
(B) Sequence alignment of the human, pig, and mouse sequences situated
upstream of the start codon. Potential promoter consensus motifs and
transcription factor binding sites are indicated (E box / MYOD-Q6, SP1,
CAAT box?, TATA box?, start codon).

Position and sequence conservation of repetitive sequences. 
We performed an alignment of the whole region with VISTA 
using the parameters previously cited (75% identity over 
100 bp). As many as 218 conserved segments were identified, 
covering a total of 42.6 kb (29.8 and 33.9% of the human and 
pig sequences, respectively) with 77.8% identity on average. 
Forty-seven of these regions correspond to exons, covering 
8.4 kb (19.8% of the alignable sequence). 
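
The percentages in this paragraph can be cross-checked with a few lines of
arithmetic. The sketch below uses only the rounded figures quoted in the
text; the helper names are hypothetical, and small deviations from the
reported values are to be expected because the inputs are rounded.

```python
def implied_total_kb(aligned_kb, percent_of_total):
    """Total sequence length implied by an aligned length and its share."""
    return aligned_kb / (percent_of_total / 100.0)


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


conserved_kb = 42.6

# 42.6 kb is quoted as 29.8% of the human and 33.9% of the pig sequence
human_kb = implied_total_kb(conserved_kb, 29.8)
pig_kb = implied_total_kb(conserved_kb, 33.9)

# Exons are quoted as covering 8.4 kb of the alignable sequence
exon_share = percent(8.4, conserved_kb)
```

The implied lengths come out at roughly 143 kb for human and 126 kb for pig,
in line with the region sizes given elsewhere in the text, and the exon
share is a little under 20%.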

The human and pig sequences were screened for repeats
using the RepeatMasker database. Species-specific SINE
repeats cover 20.5 and 12.3% of the human and pig sequences,
respectively. This discrepancy can be partially explained by the
fact that pig-specific repeats are less well characterized than
human ones. Furthermore, the RepeatMasker program detects
divergent repeats inefficiently above 37% divergence
(Smit, 1999). However, it is most likely that this
discrepancy primarily reflects a true difference in the
species-specific repeat content between human and pig in this region.


Other SINE repeats such as Mammalian Interspersed Repeats
(MIR) and Monotreme (MON) are present in very similar
amounts (1.7 and 1.6% in human and pig, respectively), and
most of them are conserved in position and show a high
sequence similarity: 67% of human and 81% of pig sequences
corresponding to MIR or MON repeats display an average
sequence identity of 79.8% between the two species (Table 2).
LINE L1 repeats are, as expected, the most abundant of the
LINE family, covering 12% of the human sequence and 16% of
the pig sequence. About half of the human L1 repeats are found
in the corresponding position in the pig and they show a fair
sequence identity in the parts that can be aligned without
ambiguity. The shared LINE elements are often interrupted by the
subsequent introduction of more recently spread repeats,
usually species-specific SINEs, and therefore it becomes difficult to
perform an alignment between the same repeat found in both
species. Most LINE L2 elements in this region are shared
between pig and human: ten out of 12 L2 elements are located
in the same position in the pig and human genome with an
average sequence identity of 69%. The proportion of the
sequence covered by LTR elements is about three times higher
in human compared to pig, although their total number is only
1.5 times higher. These elements thus seem to be shorter in the
pig or they have diverged to such an extent that they are not
detected by the RepeatMasker program. About one third of
detected pig LTR sequences are shared with human. DNA
transposons represent a small proportion of the sequence
(3.9 kb in human and 2.9 kb in pig), and they show a fairly high
sequence identity (72.3%; Table 2). Overall, repetitive and
coding sequences constitute about 30 and 20%, respectively, of the
sequence that can be aligned.


Fig. 2. Sequence identity plots for pairwise comparisons of the human
and pig regions. The human sequence is taken from GenBank AC009974
(positions 29,445-173,047 and reverse-complemented). Aligned sequences
are shown relative to their positions in the human sequence (horizontal axes),
and the percentage of identities (50-100%) is indicated on the vertical axes.
Regions with less than 50% sequence identity could not be aligned
unambiguously. The grey lines below the profiles indicate the position of the
13 different pig contigs. The locations of the genes and their exons are shown
above the profile. MIRs, LINEs, and DNA transposons are indicated.


Table 2. Number, length, and sequence identity of interspersed repetitive sequences between human and pig

Repeat family      Number of      Total length   % of total     Repeats with   Alignable     Alignable sequence (c) /
                   repeats        (kb)           sequence       conserved      repeats (a)   identity (d)
                   Human  Pig     Human  Pig     Human  Pig     position       ((b))         (kb / %)

LINE
  L1               26     40      17.1   20.0    12.0   15.9    13             12 (2)        4.7 / 73.1
  L2               13     12      3.5    2.7     2.4    2.2     10             10 (1)        3.0 / 69.2
SINE
  Alu              109    -       29.4   -       20.5   -       -              -
  Pig-specific     -      75      -      15.4    -      12.3    -              -
  MIR+MON          17     19      2.5    2.0     1.7    1.6     12             13 (3)        1.7 / 79.8
LTRs               27     17      17.1   5.4     12.0   4.3     7              5             1.5 / 68.7
DNA transposons    13     9       3.9    3.0     2.8    2.4     7              6             1.8 / 72.3
Total              205    172     73.4   48.6    51.4   38.6    49             46            12.6 / 72.4

a Number of repeats with conserved position and showing a sequence similarity allowing a reliable alignment.
b Number of repeats with conserved positions between human and pig but identified by RepeatMasker in one species only.
c Total length of sequence inside repeats that can be aligned without ambiguity between the two species.
d Average sequence identity between human and pig in the part of repeats that can be aligned.

Large regions that cannot be aligned correspond to the
presence of repetitive elements in one species only. This is more
pronounced in the human sequence where large stretches of
repeats are observed. These regions appear clearly on the
VISTA plot (Fig. 2) because the distances on the horizontal axis are
given according to the human sequence. The 11-kb region,
between positions 35 and 46 kb in the human sequence,
consists entirely of LTR repeats interrupted by Alu repeats. The
corresponding region in the pig also contains LTR elements
and SINE repeats but there is no detectable sequence similarity
to human. In the region between 101 and 110 kb, a large
number of repeats (mostly LINE L1) are present in the human
sequence but few of them were detected in the pig. The human
region spanning from 113 to 116.5 kb mostly contains LTRs
and Alus. There is a large L1 repeat between positions 61.6 and
68.4 kb in the human sequence. This is a complete repeat,
interrupted by two Alu repeats. The insertion of this LINE L1
probably occurred in human inside a MIR element that must
have been present before the divergence of human and pig as it
remains intact in the pig. We did not observe comparable large
non-conserved stretches of repeats in the pig sequence. Repeats
that are present only in the pig appear to be more evenly spread
and clusters of repeats do not exceed 3 kb, with the exception of a
6-kb long cluster of L1 and SINE repeats in CYP27A1 intron 1.

KIAA0173 exon 15 is duplicated and transposed to the
CYP27A1 intron 1 in the pig. During the assembly process, we
noticed the presence of two copies of exon 15 of the KIAA0173
gene in the pig sequence. This duplication involves 421 bp,
including 31 bp upstream of exon 15, the exon itself, and
285 bp of the sequence 3' of the exon. Specific PCR
amplification and sequencing of the duplicated segment was performed
on genomic DNA and confirmed that the observation was not
due to a cloning or assembly artifact (data not shown). The
duplicated segment is situated about 20 kb downstream of the
original copy, inside the first intron of the CYP27A1 gene, and
it displays 72% identity with the original copy. This copy of
exon 15 is 33 bp shorter than normal, but the deletion does not
disrupt the reading frame and the splice sites are conserved.
Because of the localization of this duplicated exon inside
CYP27A1 intron 1, we have searched for a possible insertion of
this exon into the CYP27A1 transcript. RT-PCR experiments
performed on kidney, liver, and skeletal muscle did not show
any additional exon between exon 1 and exon 2 of the
CYP27A1 cDNA, suggesting that the duplicated exon does not
interfere with CYP27A1 transcription. The insertion of this
duplicated segment occurred inside ORF2 of a LINE L1M2
repetitive element (Fig. 3A). This LINE repeat is included in a
large (6 kb) cluster of repeats (LINEs and SINEs) that is not
present in the human sequence (see above). By comparing this
element to the L1M2-orf2 consensus sequence (A.F.A. Smit
and P. Green, unpublished, RepeatMasker at
http://ftp.genome.washington.edu/index.html), we found evidence for a
target site duplication of an 11-bp sequence on both sides of the
insertion as well as several stretches of T-nucleotides around
the insertion point (Fig. 3B). The two copies of the 11-bp
duplicated element differ from each other and from the consensus
sequence by 2 and 3 bp, respectively. However, we can exclude
the possibility of a LINE-related homologous recombination
event since the native copy of the duplicated segment is not
surrounded by any repetitive element.

Fig. 3. (A) Sequence alignment of the normal (top) and duplicated
(bottom) copies of the pig KIAA0173 exon 15-containing segment together
with the surrounding sequence. The beginning and end of the duplicated
segment are indicated by black arrows. The exon is in bold and the splice
sites are underlined. The LINE L1 sequence where the insertion occurred is
boxed. (B) Sequence alignment between the L1M2-orf2 consensus sequence
(RepeatMasker database) and the LINE L1 sequence containing the insertion
point of the duplicated segment. An 11 bp element duplicated on both sides
of the insertion is boxed.
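
The target site duplication described in the text can be looked for by
comparing the bases immediately flanking an insertion while tolerating a few
mismatches. The following is a minimal sketch under stated assumptions: the
insertion boundaries are already known, the function names and the default
mismatch limit are hypothetical, and the example sequence used in the usage
note is invented rather than taken from the pig BAC.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def flanking_duplication(seq, ins_start, ins_end, size=11, max_mismatch=3):
    """Compare the `size` bases on either side of an insertion.

    `ins_start` and `ins_end` delimit the inserted segment (0-based, end
    exclusive). Returns (left_copy, right_copy, mismatches) when the two
    flanks differ at no more than `max_mismatch` positions, otherwise None.
    """
    if ins_start < size or ins_end + size > len(seq):
        return None  # not enough flanking sequence to compare
    left = seq[ins_start - size:ins_start].upper()
    right = seq[ins_end:ins_end + size].upper()
    mismatches = hamming(left, right)
    if mismatches <= max_mismatch:
        return left, right, mismatches
    return None
```

Called on a toy sequence in which an 8-bp insert is flanked by
`TTTTCTCTTAA` and `TTTTCTGTTAA`, the function returns both copies together
with a mismatch count of 1.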

The human KIAA0173 gene harbors alternative 5' untranslated
exons originating from repetitive elements. We first found
evidence that the human KIAA0173 exon 2, present in the
full-length cDNA sequence (GenBank acc. no. D79995), is derived
from a human-specific LINE L1 repetitive element (L1M3f;
Fig. 4A). This element is interrupted by the insertion of several
other repeats, including seven Alu elements and one MER
element. The L1M3f repeat is 2,345 bp long and corresponds to
the 5' end (base 1 to 2345) of the 6,328-bp long L1M3f
consensus sequence present in the RepeatMasker database (A.F.A.
Smit and P. Green, unpublished;
http://ftp.genome.washington.edu/index.html). There is no evidence for
the presence of the 3' end of this repeat. Exon 2 corresponds to
nucleotides 1100-1178 of the repeat and is therefore most probably a part
of ORF1 of the L1 repeat, which encodes an RNA binding
protein (Howell and Usdin, 1997). The level of sequence identity
between exon 2 and the corresponding region of the L1M3f
consensus sequence is 91.1% and both consensus splice sites
arose from point mutations (Fig. 4C). We screened the human
EST database for potential alternative transcripts of the
KIAA0173 gene and we found two ESTs harboring alternative
5' UTR exons (Fig. 4B). The EST BM450150 contains a new
exon, here referred to as exon 1b, situated on the genomic
sequence inside intron 1 and corresponding to a part of an LTR
repetitive element belonging to the MER21B family. The 5' end
of this exon is not known since exon 1, which is most probably
the first transcribed exon, is not present in this EST. The EST
AU140455 contains a previously undefined exon, named exon
2b, which is, like exon 2, part of the L1M3f repeat from
nucleotide 654 to 696 of the repeat. No exon upstream of exon 2 is
present in this EST. These three exons are part of repetitive
elements and are absent in the pig sequence. However, we cannot
exclude that specific 5' UTR exons exist in the pig. Both
exons 1b and 2b show high sequence identity with the repeat
consensus sequence (Fig. 4C), but in this case, the consensus
splice sites (AG and GT) do not appear to have arisen from
point mutations. Exon 1b differs from the MER21B consensus
by the copy number of two direct repeats (boxed on Fig. 4C).
This is a clear example of the creation of new exons by the
insertion of repetitive elements into an intron.

Fig. 4. (A) Genomic structure of the 5' region of human KIAA0173. Positions of several interspersed repeats, exons, and splice
sites are indicated. (B) Partial structure and GenBank accession numbers of human transcripts (D79995 is the full transcript of the
gene; AU140455 and BM450150 are ESTs). (C) Sequence alignment of exons 1b, 2, and 2b with the corresponding interspersed
repeat consensus sequence (from RepeatMasker database). Exons are in bold, splice sites are underlined, and direct tandem
repeats are boxed.

Discussion 

The PRKAG3 coding sequence as well as the 3' UTR are
well conserved between pig, human, and mouse. Conserved
elements were also found in the 5' region of the gene, one of them
harboring a putative promoter element with several consensus
binding sites for muscle-specific transcription factors,
supporting the notion that this represents the correct promoter. The
sequence comparison of the 130-kb region surrounding
PRKAG3 between human and pig reveals structure and
sequence similarity affecting both genes and repetitive sequences.
Three complete genes (including PRKAG3) and one partial
gene are present in this region spanning approximately 140 kb
in humans. This is higher than the average gene density
estimated to be ~1 gene/100 kb, but is consistent with the high GC
content of the region (45%). Indeed, genes tend to cluster in
GC-rich regions (International human genome sequencing
consortium, 2001).

The extensive comparison of repetitive sequences based on
the draft sequences of the human and mouse genomes revealed
an average sequence identity of dispersed ancestral repeats of
66.7% (Mouse genome sequencing consortium, 2002). We
observed a significantly higher average sequence identity
(72.4%) for the 12.6 kb of ancestral repeats that were alignable
in our pig-human comparison. The striking sequence identity
observed for the ancestral Mammalian Interspersed Repeats
(MIR) shared between pig and human is noteworthy (79.8% for
about 70% of their length; Table 2). The higher sequence
identity observed in the human/pig comparison was observed
despite the fact that recent molecular phylogenetic analyses, in
contrast to earlier studies, indicate that mouse and human
diverged from a common ancestor subsequent to the split from the
common ancestor shared with the pig (Murphy et al., 2001;
Madsen et al., 2001). However, the observed higher sequence
identity for ancestral repeats shared between pig and human
than for those shared between mouse and human is most likely
explained by the high average substitution rate in the rodent
lineage. The analysis of human/mouse repeats indicated a
twofold higher substitution rate in mouse compared with human
(Mouse genome sequencing consortium, 2002). Our data
indicate that the substitution rates in the human and pig lineages
are similar since the observed divergence from ancestral
repeats for aligned human and pig LINE L2 repeats is consistent
with the estimation of 25-35% for the whole human genome
(International human genome sequencing consortium, 2001).
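
For readers who wish to relate percent identity to divergence, the
conversion can be sketched as below. The observed divergence is simply one
minus the identity. The Jukes-Cantor correction for multiple substitutions
at the same site is included purely as an illustration chosen for this
sketch; the text does not state whether any such correction was applied to
the figures quoted above, and the function names are hypothetical.

```python
import math


def observed_divergence(identity_percent):
    """Proportion of differing sites implied by a percent identity."""
    return 1.0 - identity_percent / 100.0


def jukes_cantor(p):
    """Jukes-Cantor distance (substitutions per site) for a proportion p
    of differing sites; defined only for 0 <= p < 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# LINE L2 identity between human and pig as listed in Table 2
p_l2 = observed_divergence(69.2)
d_l2 = jukes_cantor(p_l2)
```

The observed L2 divergence between the two species is 30.8%; the corrected
distance is necessarily somewhat larger than the observed proportion.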

The observation that most of the LINE L2 repeats detected 
in this study were shared between pig and humans reflects that 
they were inserted before the split between these two species, 
whereas half of the LINE L1 and all species-specific SINEs
appear to have been inserted after the divergence from a com¬ 
mon ancestor. The pig/human comparison revealed that repeti¬ 
tive sequences constituted a larger fraction of this genomic 
region in humans (51.4%) than in pigs (38.6 %). This is because 
this human region contains significantly more SINE and LTR 
sequences than the corresponding region in pig. The recent 
sequencing of the mouse genome points out that similar types 
of repeat sequences are found in the corresponding genomic 
regions in both species, reflecting an influence of the genomic 
environment in the insertion of repeats. We noticed the pres¬ 
ence of large clusters of repetitive sequences in human but sig¬ 
nificantly less in the pig sequence. This is in agreement with the 
observation made by Thomsen and Miller (1996) who noticed 
that the pattern of differential distribution of LINEs and SINEs 
is less pronounced in the pig than in human and mouse. Indeed, 
human LINEs and SINEs are predominantly located in GC- 
poor and GC-rich regions, respectively, but they appear to be 
more evenly dispersed in the pig genome. Moreover, the fact 
that the frequency of LINEs (14.4%) is lower and the frequency 
of SINEs (22.2%) is significantly higher in this region, com¬ 
pared with the average for the human genome (21 and 13 %, 
respectively), is consistent with the fact that this region has a 
fairly high GC level (45.5 compared to 41 % on average in the 
human genome). 

The KIAA0173 gene displays an important level of structural
variation between pig and human that appears to be closely
associated with interspersed repeats. Some 5' UTR exons in 
human are derived from repetitive elements which have been 
introduced after the divergence of the human and pig lineages. 
These exons showed very high sequence identity with the 
repeat they originated from and the observation demonstrates
that fragments of repeats can be inserted in the transcribed
sequence of neighboring genes. In this case, the inserted exons 
are situated in the 5' UTR, but one can imagine that the same 
phenomenon can occur inside coding sequences. L1 repeats
propagate by reverse transcription and this step often fails to 
proceed to the 5' end, resulting in many truncated insertions. 
Indeed, most of the LINEs identified in the region only harbor 
their 3' end. Interestingly, it is the opposite for the L1M3f
repeat, containing KIAA0173 exons 2 and 2b, which only consists
of the 5' end of the repeat. We are apparently facing a very
recent propagation of several repetitive sequences: the L1M3f
repeat and the subsequent insertion of SINEs into this repeat.
The impact of mobile repetitive elements on gene and genome 
evolution is now acknowledged and several cases have been 
documented. LTR (Long Terminal Repeats) elements have 
been shown to increase the transcription of the endothelin B 
receptor and apolipoprotein C-I genes in human by serving as 
an alternative promoter (Medstrand et al., 2001). A truncated 
bovine LINE has been included as part of the coding sequence 
and the 3' UTR of the bovine BCNT gene (Craniofacial Devel¬ 
opment Protein) (Iwashita et al., 2001). Several examples of 
transposition events resulting in disruption of functional genes 
have been characterized (Sheen et al., 2000). This shows that 
exons and other important elements can be missed when using 
masked sequences in comparative analyses and/or gene predic¬ 
tion. 

With the achievement of obtaining the human genome 
sequence, it is now possible to see that segmental duplication, 
point mutations, and chromosomal rearrangements appear to 
be important factors in genome evolution (Samonte and Eichler,
2002). About 5% of our genetic material is composed of
segmental duplications involving regions of genomic DNA 
ranging from 1 to 200 kb (International human genome 
sequencing consortium, 2001). These duplications affect all 
regions of the genome, both genes and high-copy number 
repeats. When a gene or part of a gene is duplicated, resulting in 
the formation of a new or chimeric gene, the term exon shuf¬ 
fling is used. Exon shuffling was suggested as a general event for 
the origin of new genes (Patthy, 1999; Long, 2001). However, 
the molecular mechanisms underlying these rearrangements 
remain poorly understood. Illegitimate recombination, LINE 
L1 element-mediated recombination, and retrotransposition
appear to be the major mutation events responsible for exon 
shuffling and they can, in some cases, explain the generation of 
new genes (Courseaux and Nahon, 2001). The transposition of 
a fragment containing a copy of the porcine KIAA0173 exon 15
into intron 1 of CYP27A1 can be classified as an exon shuffling 
event since it creates (or could create) a chimeric gene. Retro- 
position can be excluded as the causative mechanism since the 
duplication concerns a genomic segment. The duplication oc¬ 
curs inside a LINE-1 repeat. However, the fact that the original 
copy of the segment was not surrounded by any interspersed 
repetitive element suggests that we are not facing an inter¬ 
repeat recombination. The presence of a short direct repeat 
flanking the duplication point suggests an illegitimate recombi¬ 
nation event. The initial event could have been a double strand 
break with protruding ends followed by a sister chromatid 
exchange or a recombination with the chromosome homolog.
This model mechanism was proposed to explain rearrangements
involving minisatellite structures for which a duplication
flanking the mutational event point is often observed
(Debrauwere et al., 1999; Vergnaud and Denoeud, 2000). Since the
original and duplicated copies are only about 20 kb apart on the 
genome, a sister chromatid exchange is possible.

Abstract. The porcine COL10A1 gene, encoding the α1(X)
chain of type X collagen, has been sequenced. The gene structure
is evolutionarily conserved, consisting of three exons and
two introns spanning 7100 bp. Linkage mapping localized the
gene to chromosome 1, which is in agreement with human-pig
homology maps. Furthermore, protein structure comparison of
the functionally important carboxyl domain between species 
revealed that amino acid changes were few and mainly situated 
in loop regions. 

Copyright©2003 S. Karger AG, Basel 


The proteins of the collagen superfamily, which comprises 
at least 19 different types of collagen and an additional ten dif¬ 
ferent proteins with collagen-related domains, are major com¬ 
ponents of cartilage and bone (Prockop and Kivirikko, 1995). A 
striking manifestation of the critical role played by collagens is 
the wide spectrum of diseases that are associated with dysfunc¬ 
tion of these proteins, including disorders of the skeleton 
(Mundlos and Olsen, 1997a, b). Among these is the human 
dwarf phenotype called Schmid metaphyseal chondrodysplasia 
(SMCD), which is caused by mutations in type X collagen
(Warman et al., 1993). Type X collagen is a homotrimer consisting
of three α1(X) chains and it is noteworthy that most, if
not all, pathological mutations impinge on the structural integrity
of the trimer by weakening or even fully disrupting α1(X)
chain interactions. Thus, it is significant that 25 of 27 mutations
described in human type X collagen are situated in the
carboxyl non-collagenous NC1 domains (Chan and Jacenko,
1998), which are responsible for the initial protein-protein 
interactions leading to assembly of the trimeric structure 
(Zhang and Chen, 1999). Destabilized trimers appear not to be 
secreted by chondrocytes into the extracellular matrix, which 
suggests that the observed abnormal growth plate function is 
likely a result of type X collagen haploinsufficiency (Chan et al., 
1998) although dominant interference of the mutated chains 
may play a role as well. 

We have previously demonstrated that a non-conservative 
amino acid substitution in type X collagen in pigs also results in 
dwarfism (Nielsen et al., 2000). This finding is important 
because existing murine models of SMCD show little pheno¬ 
typic resemblance to SMCD (Rosati et al., 1994; Kwan et al., 
1997), suggesting that the dwarf pigs provide a unique opportu¬ 
nity to get insight into the role of type X collagen. Thus, histo¬ 
logical examinations revealed malformations in the growth 
plate architecture characterized by a widely expanded hyper¬ 
trophic zone containing disorganized columns of chondrocytes, 
and accumulation of cartilage in the metaphyses. This suggested
that type X collagen serves structural roles in the extracellular
matrix by providing the molecular environment required
for endochondral bone formation. To facilitate future studies 
on the role of type X collagen in normal and diseased states in 
the pig animal model, we here report the chromosome location, 
the complete genomic organization and single nucleotide poly¬ 
morphisms (SNPs) of the porcine COL10A1 gene. The data 
confirm the strong evolutionary conservation of both the gene 
structure and amino acid sequence. 


Materials and methods 

PCR amplification 

We designed primers for PCR amplification of the pig COL10A1 gene by 
alignment of the genomic sequences of human and mouse COL10A1 genes to 
identify stretches of highly conserved nucleotide sequences. PCR was
performed in a total volume of 10 µl, using 100 ng of genomic DNA. The PCR
profile was as follows: 94 °C for 3 min, 30 cycles of 94 °C for 30 s, 60 °C for
30 s, 72 °C for 90 s, and finally elongation at 72 °C for 7 min.

DNA sequencing 

All DNA sequencing was accomplished using the BigDye Terminator 
Cycle Sequencing Kit (PE Applied Biosystems) according to supplier’s 
instructions. Sequencing products were analyzed on an ABI 377 Sequencer 
and the resulting sequences were assembled using SEQUENCHER Version 
4.0.5 (Gene Codes Corporation). The porcine COL10A1 sequence has been 
deposited in the GenBank database (Accession number AF222861). 

SNP identification 

The 3'-end of the COL10A1 gene was PCR amplified using the primers
5'-TTCAGCCTACCTCCATATGCAT-3' and 5'-CARCAGCAYTAYGACCC-3'
(R = A or G; Y = C or T). The PCR conditions were: 94 °C for 3 min,
30 cycles of 94 °C for 30 s, 57 °C for 30 s, 72 °C for 90 s, and finally
elongation for 7 min at 72 °C. The PCR fragments were subsequently sequenced
and SNPs were identified by alignment of the sequences using SEQUENCHER
Version 4.0.5 (Gene Codes Corporation).
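A degenerate primer such as the second one above stands for a pool of concrete oligonucleotides, one for each combination of the ambiguous bases. The sketch below enumerates that pool for the primer read here as CARCAGCAYTAYGACCC, using only the two ambiguity codes defined in the text (R = A or G; Y = C or T). The function name is illustrative and not part of any published tool.

```python
from itertools import product

# Only the codes needed for this primer; R and Y follow the definition in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("CARCAGCAYTAYGACCC")
for seq in variants:
    print(seq)
```

With one R and two Y positions, the pool contains 2 × 2 × 2 = 8 distinct 17-mers.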

Linkage mapping 

The six three-generation PiGMaP families (Archibald et al., 1995) were
screened to identify a restriction fragment length polymorphism at the
COL10A1 gene using Southern blot analysis as described (Nielsen and
Thomsen, 2000). A total of 35 informative meioses were observed with a
polymorphic TaqI site and linkage analysis using CRIMAP was done by
combining the COL10A1 genotypes with marker information in the PiGMaP
Consortium ResPig Database (Archibald et al., 1995).

Protein model 

The coordinates of the human NC1 crystal structure (Protein Data Bank,
accession code 1gr3) were applied in order to illustrate the differences
between the human and the porcine NC1 domains. The differences were
highlighted using the program Protein Explorer. 


Results 

Characterization of the genomic organization of the porcine COL10A1 gene

Alignment of the genomic sequences of the human and 
mouse COL10A1 genes revealed highly conserved regions at 
the nucleotide level. This information was applied in the design 
of primers, which enabled us to PCR amplify overlapping frag¬ 
ments covering the entire porcine gene. The fragments were 
sequenced and annotated based on sequence alignment with
the mouse and human COL10A1 genomic sequences (Table 1).
The size of porcine COL10A1 from the putative TATA-box to 
the termination codon is approximately 7100 bp. Thus, exon 1 
in pigs encodes 82 bp of the 5'-end untranslated sequences, 
which is similar to the situation in humans where exon 1 
(80 bp) codes for only untranslated sequences. The size of 
intron 1 is 613 bp and comparison of the sequence to that of 
human (526 bp) showed that they share significant homology in 
two regions (90 bp of 80% identity and 28 bp of 96 % identity), 
suggesting that the pig intron 1 may contain elements with 
enhancer functions as observed in humans. The second exon is 
172 bp long whereas the size in humans is 169 bp and it codes 
for 15 bp of 5'-end untranslated sequence, a signal peptide of 19 
amino acid residues as well as 33 amino acid residues of the 
amino-terminal non-collagenous NC2 domain. Exon 2 and 
exon 3 are separated by a large intron in both pigs (4252 bp) 
and humans (3639 bp). Sequence analysis of intron 2 revealed 
the presence of a PRE-1 type short interspersed element 
(SINE), which is widespread in the porcine genome with copy 
estimates between 5 × 10^4 and 2 × 10^6 (Alexander et al., 1995).
Exon 3 has a length of 1871 bp which compares well with the 
1888 bp of the corresponding exonic sequence in humans, and 
it codes for five amino acids of the NC2 domain, the complete 
triple-helical domain of 463 amino acids and the carboxy-terminal
NC1 domain of 155 amino acid residues. All the exon-intron
boundaries have the consensus GT-AG splice junctions
(Table 1). 
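The sizes reported in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below sums the exon and intron lengths and compares the coding length with the domain lengths given in the text. It assumes that the 1871 bp of exon 3 end with the termination codon, which is how the exon 3 boundary ending in TGA in Table 1 is read here.

```python
# Sizes in bp as reported in the text and in Table 1.
exons = {1: 82, 2: 172, 3: 1871}
introns = {1: 613, 2: 4252}

# Span from the start of exon 1 to the end of exon 3.
gene_span = sum(exons.values()) + sum(introns.values())

# Exon 2 carries 15 bp of 5' untranslated sequence; the remainder of exon 2
# plus all of exon 3 is treated as coding sequence including the stop codon
# (an assumption of this sketch).
coding_bp = (exons[2] - 15) + exons[3]

# Domain lengths in amino acids as reported in the text
# (NC2 is split between exon 2 and exon 3: 33 + 5 residues).
domains = {"signal": 19, "NC2": 33 + 5, "triple_helix": 463, "NC1": 155}
protein_aa = sum(domains.values())
codons_incl_stop = protein_aa + 1

print(gene_span, coding_bp, protein_aa)
```

The exon and intron sizes add up to 6990 bp, in line with the approximate figures of 7000 to 7100 bp quoted for the gene, and the 2028 coding bases correspond to 675 codons plus the stop codon.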

Single nucleotide polymorphisms (SNPs) in the porcine COL10A1 gene

Much effort is currently focussing on the identification of 
SNPs in farm animals. We subjected the COL10A1 gene to 
SNP discovery by sequencing the 3'-end in 81 unrelated ani¬ 
mals sampled from diverse pig breeds including Landrace, 
Duroc, Hampshire, Yorkshire, and Black-white Pied. Two 


Table 1. Genomic organization of the porcine COL10A1


Exon  Exon size (bp)  Exon-intron boundary  Intron size (bp)  Intron-exon boundary

1     82              CAACATCCAGgtaag       613               ttcagAATCCATCTGC
2     172             AAGAGTAAAGgtaaa       4252              tttagGTATATCACTA
3     1871            GCTCCAATGTGA


The exon and intron sequences are labeled in upper and lower cases, respectively.
The conserved gt/ag exon-intron junctions are shown in boldface type. The size of
the gene from exon 1 to exon 3 is approximately 7000 bp.


Fig. 1. Multiple amino acid sequence alignment of type X collagen. The
sequences are Sus scrofa (GI: 7141254), Homo sapiens (GI: 18105032), Bos
taurus (GI: 27807139), and Mus musculus (GI: 50481). The conserved amino
acids among the aligned sequences are marked with asterisks (*). In cases of
non-conserved amino acids the actual change is written in one letter code.
The eight imperfections in the Gly-X-Y triplets of the collagenous domain
are overlined. Vertical arrows (↓) indicate the start of the triple helix and the
NC1 domain.

Fig. 2. Three-dimensional structure for the human collagen X NC1
monomer presented in cartoon mode. The NC1 monomer is shown in
two different orientations, A and B. The N-terminal 28 amino acids
of the human NC1 domain cannot be viewed on the figure since these
are disordered in the crystal structure. The monomer unit consists
of ten β-strands connected by loops. Amino acid variations between
the human and porcine NC1 monomers are represented with balls for
atoms and sticks for bonds (blue: nitrogen, red: oxygen).



polymorphic sites, a transition COL10_1 (C/T) and a transver¬ 
sion COL10_2 (G/T), were located in the coding region of exon 
3 (position 6982) and 30 bp downstream of the termination 
codon. The coding SNP was a synonymous substitution. Both 
SNPs were represented in all breeds except for Hampshire pigs, 
which were homozygous at these sites. The calculated allele fre¬ 
quencies are summarized in Table 2. 
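The breed-level frequencies in Table 2 can be combined into a panel-wide estimate by weighting each breed by its number of animals. The sketch below does this for the C allele of COL10_1 and the G allele of COL10_2, and also derives the expected heterozygosity 2pq per breed. The input values are transcribed from Table 2 and are rounded to two decimals there, so the pooled figures are approximate; pooling across breeds is an illustration and not an analysis performed in the paper.

```python
# (breed, number of animals, freq of COL10_1 allele C, freq of COL10_2 allele G),
# transcribed from Table 2.
table2 = [
    ("Landrace", 18, 0.39, 0.89),
    ("Duroc", 18, 0.78, 1.0),
    ("Hampshire", 19, 1.0, 1.0),
    ("Yorkshire", 17, 1.0, 0.94),
    ("Black-white Pied", 9, 0.33, 0.39),
]

total_animals = sum(n for _, n, _, _ in table2)

# Sample-size-weighted mean frequency across breeds.
pooled_c = sum(n * c for _, n, c, _ in table2) / total_animals
pooled_g = sum(n * g for _, n, _, g in table2) / total_animals


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2pq for a biallelic locus under Hardy-Weinberg."""
    return 2.0 * p * (1.0 - p)


het_col10_1 = {breed: expected_heterozygosity(c) for breed, _, c, _ in table2}

print(total_animals, round(pooled_c, 3), round(pooled_g, 3))
```

A breed with a frequency of 1.0 has an expected heterozygosity of zero, which is how fixation at a site shows up in such a table.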

Chromosomal localization of porcine COL10A1 

Next, we determined the chromosome location of
COL10A1 by linkage analysis. The three-generation PiGMaP
reference families (Archibald et al., 1995) were genotyped for a
TaqI restriction fragment length polymorphism by Southern
blotting using a fragment of the 3'-end of exon 3 as hybridization
probe. The presence of a polymorphic TaqI restriction site
at position 7018 (G/A) was confirmed when genomic fragments
containing the 3'-end were sequenced. The human COL10A1
gene is known to be located in HSA6q21→q22.3, and available
comparative human-pig gene maps therefore predicted that the
porcine ortholog is positioned on pig chromosome 1 (Thomas
et al., 1991; Goureau et al., 1996; Yerle et al., 1997; Chaudhary
et al., 1998). For that reason, the COL10A1 genotypes were
merged with the data on SSC1 loci of the PiGMaP Consortium
ResPig Database and analyzed using CRIMAP. This yielded
two-point LOD scores of 9.33 and 8.13 for microsatellite markers
S0396 and S0312, which assigned the COL10A1 gene to
SSC1.
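A two-point LOD score compares the likelihood of the observed meioses at a given recombination fraction with the likelihood under free recombination. The sketch below implements the textbook closed form for phase-known, fully informative meioses. It is only an illustration of the quantity being reported: CRIMAP computes likelihoods on the actual pedigrees, where phase is not always known, so the values of 9.33 and 8.13 cannot be reproduced from this formula, and the recombinant counts used in the example are hypothetical.

```python
import math


def two_point_lod(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """LOD = log10[ theta^k (1 - theta)^(n - k) / 0.5^n ] for phase-known meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("recombinants must be between 0 and the number of meioses")
    k, n = n_recombinants, n_meioses
    log_likelihood = 0.0
    if k > 0:
        if theta == 0.0:
            return float("-inf")  # recombinants are impossible at theta = 0
        log_likelihood += k * math.log10(theta)
    if n - k > 0:
        log_likelihood += (n - k) * math.log10(1.0 - theta)
    return log_likelihood - n * math.log10(0.5)


# 35 informative meioses as in the study; the recombinant counts are hypothetical.
lod_no_recombinants = two_point_lod(35, 0, 0.0)
lod_one_recombinant = two_point_lod(35, 1, 1 / 35)
print(round(lod_no_recombinants, 2), round(lod_one_recombinant, 2))
```

In this idealized setting the score is maximized at theta = k/n, and any value above 3 is conventionally taken as significant evidence for linkage.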

Interspecies comparison of the amino acid sequences of type X collagen

From the genomic nucleotide sequence of the COL10A1 
gene we deduced the primary amino acid sequence and com¬ 
pared it with the primary sequences of human, cow and mouse 
(Fig. 1). The alignment revealed that the porcine α1(X) chain
possesses a signal peptide of 19 amino acids as opposed to 18 


Table 2. Allele frequencies of the SNPs COL10_1 and COL10_2 in a pig
breed panel. The SNPs identified are COL10_1 (C/T) in codon 642, and
COL10_2 (G/T) 30 bp downstream of the termination codon


Breed             No. of    Allele frequencies
                  animals   COL10_1          COL10_2
                            C       T        G       T

Landrace          18        0.39    0.61     0.89    0.11
Duroc             18        0.78    0.22     1.0     0
Hampshire         19        1.0     0        1.0     0
Yorkshire         17        1.0     0        0.94    0.06
Black-white Pied  9         0.33    0.67     0.39    0.61

residues in the other species. Furthermore, the triple-helical 
domain is 463 amino acids long, and it is noteworthy that eight 
imperfections in the repeating Gly-X-Y triplet are situated in 
identical positions in the four species and, hence, conserved 
through evolution. The carboxy-terminal NC1 domains are six
residues shorter in the pig and cow relative to the length of the
human and mouse counterparts. Since the structure of the NC1
domain is crucial to the function of type X collagen, it is of
interest to compare sequences from different species in the context
of a three-dimensional model. To this end, we utilized data
from the crystal structure of the human NC1 domain to show
the variations between the human and porcine NC1 domains
(Fig. 2). The first 28 residues of the human NC1 domain are
not visible in the three-dimensional crystal structure, presumably
because the N-terminus is structurally disordered. The
NC1 monomer consists of a 10-stranded β-sandwich with jelly-roll
topology. The β-strands are connected by loops, most of
them being tight reverse turns (Bogin et al., 2002). In Fig. 2, the
ten amino acids that differ between the two species are presented
as balls and sticks, and it is striking that the differences,
except for three, are all located in loop regions of the NC1 structure.
Three of the substitutions (positions 598, 605, and 637)
are situated within β-strands; however, these amino acid
changes are conservative and, therefore, unlikely to change the
structure significantly. Furthermore, inspection of the primary
sequence alignment shows that these ten positions are highly
variable across all four species whereas the flanking segments
are conserved, suggesting that loops in NC1 are structurally
flexible.

Discussion 

Type X collagen is expressed specifically in the hypertrophic
chondrocytes during endochondral ossification. In human
tissue, the tightly regulated, cell-specific expression pattern
is achieved by a concerted action of positive and negative
cis-acting elements in the promoter of COL10A1 (Beier et al.,
1997). In the chicken, transcriptional repression seems to be 
the main regulatory mechanism, where the region immediately
upstream of the initiation site acts to restrict expression to
hypertrophic chondrocytes (Long and Linsenmayer, 1995). 
Our analysis of the porcine COL10A1 gene includes approxi¬ 
mately 100 bp upstream of the transcriptional initiation site 
and alignment to the human sequences revealed 84% se¬ 
quence identity, showing that the proximal promoter is high¬ 
ly conserved. Furthermore, the organization of the pig 
COL10A1 gene shows that it has the typical condensed struc¬ 
ture observed also in other species with conservation of both 
length and positions of the exons. Thus, the gene covers 
approximately 7100 bp and it consists of only two introns and 
three exons, where exon 3 encodes more than 90% of the pro¬ 
tein. Two SNPs were identified, one in exon 3 and another 
30 bp downstream of the termination codon. SNPs are amen¬ 
able to high-throughput analysis and, therefore, an attractive 
type of DNA marker for animal identification, paternity test¬ 
ing and genome scans for quantitative trait loci (QTL) and 
disease genes. Pig chromosome 1 has been shown to harbor 
several QTL for carcass traits (Malek et al., 2001; Nezer et al., 
2002), and the identification of two SNPs on SSC1 may contribute
to fine-mapping of candidate genes. All the exon-intron
junctions conformed to the GT-AG rule. Collectively, the
data confirm that the COL10A1 gene structure is highly con¬ 
served throughout evolution. In addition, linkage mapping 
assigned the pig COL10A1 gene to SSC1, thereby verifying the 
chromosomal location inferred from the human-pig compara¬ 
tive gene maps. 

Interspecies amino acid comparison showed that pig type X 
collagen has a highly conserved primary sequence, including 
the position of eight Gly-X-Y imperfections in the triple-helical 
domain. Evolutionary conservation of these sites indicates that 
they are essential for type X collagen function. In humans, two 
of the imperfections have been shown to be susceptible to 
cleavage by interstitial collagenases, and it was suggested that 
proteolytic degradation at these sites is important for turnover 
of type X collagen during cartilage development (Welgus et al., 
1990). Projection of the amino acid variations between the porcine
and human NC1 domains onto the three-dimensional
structure showed that most differences were located in loop
regions. This is consistent with the flexibility of loop regions in
general, which allows a range of amino acids to occupy specific
positions without disrupting the structure (Branden and Tooze,
1999). This is emphasized by the observation that the charged,
more rigid Arg578 and Glu658 in humans replace the structurally
flexible and small Gly573 and Gly653 in the porcine
α1(X) chains. The β-strands of the subunit are, to a large extent,
conserved among the species as only three amino acid residues
vary in these structural elements. Since these amino acid
changes are conservative, the substitutions can, in these cases,
be structurally tolerated without affecting any function. Taken
together, the data show that the sequence of the α1(X) chain
encoded by COL10A1 is evolutionarily highly conserved. Furthermore,
the projection showed that amino acid substitutions
acquired by the NC1 domains during evolution were largely
restricted to loop regions, suggesting that the neighboring
β-strands are highly sensitive to changes. These observations,
together with the fact that the NC1 domain constitutes a mutational
hotspot, emphasize the importance of structure for the
function of the NC1 domain.

Abstract. In contrast to human embryos, there are very few
studies published on the frequency of chromosomal aneuploidy
in farm animals. The objectives of this study were to apply a
three-color fluorescent in situ hybridization (FISH) method for
evaluating aneuploidy in porcine embryos using chromosome-specific
DNA probes, establish baseline frequencies of aneuploidy
in embryos and compare the results with our previous
findings of aneuploidy in spermatozoa and oocytes. The embryos
were collected from superovulated gilts, which were
slaughtered 48 h after insemination. FISH was performed using
probes specific for the centromeric regions of porcine chromosomes
1, 10 and Y. Altogether 403 blastomeres from 114 porcine
embryos were successfully investigated. Diploidy was observed
in 101 (88.6%) embryos, triploidy in 2 (1.8%) embryos,
mosaicism/mixoploidy in 9 (7.9%) embryos, and trisomy for 
chromosomes 1 or 10 in 2 (1.8%) embryos. No blastomere 
showed aneuploidy for chromosome Y. These findings corre¬ 
spond with the frequencies of aneuploidy we have found pre¬ 
viously in porcine germ cells. 

Copyright©2003 S. Karger AG, Basel 


Embryonic mortality has a substantial impact on the fertili¬ 
ty of domestic animals, with most of these losses occurring dur¬ 
ing the first days after fertilization. Non-infectious causes, such 
as chromosomal aberrations, hormonal imbalances, nutritional 
factors, etc. represent about 70% of the cases of embryonic 
mortality (Vanroose et al., 2000). It is evident that chromosom¬ 
al aberrations are a major cause of early pregnancy failure in 
animals (King, 1990) and humans (Pellestor, 1995). Chromo¬ 
some analyses on human embryos indicate that a significant 
proportion of these embryos are chromosomally abnormal, 
even morphologically good-quality embryos. The majority of 
embryos investigated in both human and farm animals have 
been produced in vitro, with human embryos being surplus IVF 
embryos and generally unsuitable for clinical use. These concerns
are a recognized limitation when interpreting results from
surplus in vitro cultured embryos, as they may not be represen¬ 
tative, or reflect the true chromosomal complement of in vivo 
embryos. However, when investigating embryos from farm ani¬ 
mals these concerns can be overcome, since in contrast to 
human, embryos with good and poor morphology can be exam¬ 
ined from both in vitro and in vivo fertilization. 

Very limited data exist on the incidence and risk factors for 
aneuploidy in embryos of farm animals. Analyses of IVF results 
suggest that the incidence of aneuploidies in human embryos 
fertilized in vitro is significantly higher than the simple sum of 
aneuploidies reported in both human spermatozoa and oocytes 
(Pellestor, 1995). The frequency of aneuploidy in farm animals 
is probably similar to humans based on our previous findings 
from sperm and oocytes. Consequently, farm animals may pro¬ 
vide a good model for investigating aneuploidy in in vitro and 
in vivo embryos, which is also likely to be important in under¬ 
standing aneuploidy in human embryos. Originally cytogenetic 
analysis of embryos was possible only by conventional karyo¬ 
typing. Karyotyping is technically demanding, time consuming 
and limited by the number of metaphase cells available for 
analysis. Poor chromosome morphology and artifactual loss of 
chromosomes can compromise cytogenetic results. Most previous
studies on embryos have been performed on Giemsa-stained
chromosomes with results given only as total ploidy
among investigated cells. Results obtained by karyotyping of 
pig embryos are conflicting; some authors detected a high and 
others a very low incidence of polyploid and mixoploid em¬ 
bryos. McFeely (1967) found 9.4% of abnormal embryos, Van 
der Hoeven et al. (1985) 7.3% and Long and Williams (1982) 
2.9%. Dolch and Chrisman (1981) did not find any aberrant 
embryo from a total of 169 embryos examined, whereas Moon 
et al. (1975) detected 26.6% of aberrant embryos. All quoted 
authors investigated older embryos - mostly ten days old. Mon¬ 
osomy occurrence was described in one of 76 embryos (Smith 
and Marlowe, 1971). Double trisomy was described by Ruzicska
(1968) in one of four embryos examined.

Advances in molecular biology have seen the introduction 
of fluorescence in situ hybridization (FISH) as an important 
method for analysis of numerical chromosome aberrations in 
interphase nuclei (Coonen et al., 1994; Munne et al., 1995). 
This method was used repeatedly for investigation of chromo¬ 
some disorders in bovine embryos (Viuff et al., 1999, 2000, 
2001, 2002). In our study, multi-color FISH was used reliably 
for aneuploidy detection of chromosomes 1, 10 and Y in blastomeres
of early porcine (whole) embryos.


Material and methods 

Embryo collection and fixation 

Fifteen crossbreed gilts weighing approximately 100 kg, aged between 7 
and 8 months, were treated with 800 IU pregnant mare serum gonadotropin 
(PMSG) i.m. (Sergon, Bioveta, Ivanovice na Hane), followed 72 h later with 
0.05 mg Lecirelinum i.m. (Supergestran, Ferring-Leciva, Prague). The gilts 
were inseminated 48 h after luteinizing hormone releasing hormone (LH- 
RH) administration. Any animals still in oestrus 16 h after insemination 
were re-inseminated. Animals were slaughtered 48 h after insemination. 
Embryos were flushed from oviducts or uterus by phosphate-buffered saline 
(PBS). The number of embryos and blastomeres in each embryo was noted. 
Individual embryos were placed in 1 % sodium citrate hypotonic solution, 
washed in fixative solution (0.01 N HCl/0.1% Tween 20) and then transferred
to a drop of the same fixative solution on the slides covered with
poly-D-lysine (Sigma Chemical Co., St. Louis, MO, USA). The fixative solution is
used to simultaneously dissolve the zona pellucida and blastomere cyto¬ 
plasm, while at the same time the blastomeres/whole embryo are fixed to the 
slides (Coonen et al., 1994). 

DNA probes 

Chromosome-specific probes for porcine chromosomes 1, 10 and Y were
directly labeled and used for hybridization. Probes for chromosomes 1 and Y 
were prepared using DNA sequence data from the GenBank Nucleotide 
Sequence Database (http://www.ncbi.nlm.nih.gov). The cosmid S0045 
(Yerle et al., 1994) was used as probe for the centromeric region of porcine 
chromosome 10. For three-color FISH experiments, probes were labeled with 
fluorescein-11-dUTP and/or Cy3-dUTP (Amersham Life Science Inc., Ar¬ 
lington Heights, IL, USA). 

Fluorescence in situ hybridization 

Slides with blastomeres from fixed whole porcine embryos were dehydrated
in an ascending ethanol series (70, 85, 100%) for 2 min each and air-dried.
A 10-µl sample of the combined probe hybridization mixture (pH 7.0)
containing 55% formamide, 1× SSC, 10% dextran sulfate (Sigma), 10 µg of
sheared salmon sperm DNA (Sigma), 50 ng of probe for chromosome 1, 4 ng
of probe for chromosome Y and a combination of 3 ng fluorescein/20 ng
Cy3-labeled probes for chromosome 10 was placed on slides. The probe
hybridization mix and target interphase nuclei were simultaneously denatured
(under 24 × 24 mm² coverslips, sealed with rubber cement) at 73 °C for
5 min and allowed to hybridize overnight at 37 °C in a moist chamber.


The slides were washed at 45 °C in 50% formamide/2× SSC followed by
2× SSC for 10 min each. Nuclear DNA was counterstained with DAPI (Sigma)
in antifade solution (Vector Laboratories, Burlingame, CA, USA).

Slides were examined using an Olympus BX60 fluorescence microscope 
equipped with a DAPI/FITC/Texas Red triple bandpass filter. The signals 
were recorded separately with single DAPI, FITC, Texas Red bandpass fil¬ 
ters using a CCD camera. Image analysis was performed with the In Situ 
Imaging System (ISIS) (MetaSystems GmbH, Germany).

Scoring criteria 

Only intact and non-overlapping embryonic nuclei were scored. Analysis
of the FISH signals was performed according to the criteria set by Martini
et al. (1997). Minor hybridization spots of much lower fluorescence
intensity were not scored, and spots found in close proximity to one
another, interconnected or in paired arrangement, were counted as one
signal. A nucleus was considered diploid if it presented 2 + 2, 2 + 1 or
2 + 0 signals for chromosomes 1 and 10. An embryo was considered trisomic
or tetrasomic if each blastomere presented 3 or 4 signals for either
chromosome 1 or chromosome 10, and two for the other chromosome. A
triploid embryo was scored if at least half of the blastomeres showed
three signals for both chromosomes 1 and 10, and the remaining blastomeres
showed three signals for at least one of these chromosomes. Embryos were
classified as mosaic if at least one blastomere showed disomic signals for
chromosomes 1 and 10 (2n, diploid) and other blastomeres showed trisomic,
tetrasomic or polyploid hybridization signals.
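
The embryo-level rules above can be read as a small decision procedure. The following Python sketch is one possible reading of them, not the scoring procedure actually used in the study; the function names, the representation of a blastomere as a pair of signal counts and the order in which the rules are tested are assumptions made for illustration.

```python
from typing import List, Tuple

# A blastomere is represented by its FISH signal counts:
# (signals for chromosome 1, signals for chromosome 10).
Blastomere = Tuple[int, int]


def is_diploid(cell: Blastomere) -> bool:
    """Assumed reading of the 2 + 2, 2 + 1 or 2 + 0 rule: two signals for
    one chromosome and no more than two for the other."""
    return max(cell) == 2


def classify_embryo(cells: List[Blastomere]) -> str:
    """Classify an embryo from its scored blastomeres (illustrative only)."""
    if not cells:
        raise ValueError("at least one scored blastomere is required")

    diploid = [is_diploid(c) for c in cells]
    if all(diploid):
        return "diploid"
    if any(diploid):
        # Mosaic: 2n cells next to cells carrying extra signals.
        aberrant = [c for c, d in zip(cells, diploid) if not d]
        return "mosaic" if all(max(c) >= 3 for c in aberrant) else "unclassified"

    # Triploid: at least half of the cells show 3 + 3 signals and every
    # remaining cell shows three signals for at least one chromosome.
    n_33 = sum(1 for c in cells if c == (3, 3))
    if 2 * n_33 >= len(cells) and all(3 in c for c in cells):
        return "triploid"

    # Trisomic / tetrasomic: every cell shows 3 (or 4) signals for one
    # chromosome and two for the other.
    if all(sorted(c) == [2, 3] for c in cells):
        return "trisomic"
    if all(sorted(c) == [2, 4] for c in cells):
        return "tetrasomic"
    return "unclassified"
```

For example, `classify_embryo([(2, 2), (2, 2), (2, 2), (2, 4)])` is labeled mosaic under these assumptions, because three blastomeres are diploid and one carries four signals for chromosome 10.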


Results 

A total of 142 embryos were obtained from 15 gilts, but only 114 embryos
were successfully examined, as the remaining embryos were lost to
technical failures. Embryos ranged from the 2- to 8-cell stage, with the
majority being 4-cell (55%) and 6-cell (22%) embryos. A total of 403
blastomeres from 114 embryos were successfully investigated (Table 1).
Only eight gilts had exclusively normal embryos with the correct number of
assessed chromosomes (Fig. 1). In six other gilts the proportion of normal
embryos ranged from 80 to 90%. One gilt showed 50% of embryos to be
abnormal; however, this apparently high abnormality rate may be
attributable to the limited number (4) of embryos investigated.
Altogether, 101 (88.6%, 95% confidence interval [CI] 82.7-94.5) embryos
showed normal diploidy and 2 (1.8%, CI -0.7-4.2) embryos showed triploidy
(Fig. 2). Trisomy for chromosomes 1 or 10 was observed in 2 (1.8%,
CI -0.7-4.2) embryos (Table 2). Fifty-two embryos (45.6%) showed one
signal of chromosome Y, consistent with male sex. We did not find any
cells aneuploid for chromosome Y. A rather high number of embryos (7.9%,
CI 2.9-12.9) showed mosaicism/mixoploidy for chromosomes 1 and/or 10.
These embryos contained trisomic, tetrasomic and polyploid cells.
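
The text does not say how the confidence intervals were computed. The negative lower bounds suggest a symmetric approximation around the observed proportion, so the sketch below assumes a normal-approximation (Wald) interval; the function name and the choice of z = 1.96 are assumptions, and the last digit may differ slightly from the published values.

```python
import math


def wald_ci_percent(successes: int, n: int, z: float = 1.96):
    """Return (percent, lower, upper) for a normal-approximation interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return tuple(100 * v for v in (p, p - half_width, p + half_width))


# Embryo counts out of 114 analyzed: 101 diploid, 2 triploid, 2 trisomic
# and 9 mosaic (the last figure corresponds to the reported 7.9%).
for label, count in [("diploid", 101), ("triploid", 2),
                     ("trisomic", 2), ("mosaic", 9)]:
    pct, low, high = wald_ci_percent(count, 114)
    print(f"{label}: {pct:.1f}% (CI {low:.1f} to {high:.1f})")
```

With so few aberrant embryos the symmetric interval extends below zero, which illustrates why the rare categories are estimated only imprecisely from 114 embryos.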

Discussion 

Although autosomal trisomy has been reported to be the most common
chromosome aberration in human embryos (Boue, 1995), can the same trend
be assumed in animals? Published data concerning chromosome disorders in
livestock embryos are very limited. Most studies have used conventional
cytogenetic methods and report only the total ploidy of cells; they have
not evaluated aneuploidy for individual chromosomes.

Fig. 1. Two blastomeres, each showing two signals for chromosomes 1
(green) and 10 (yellow) and one signal for chromosome Y (red).



Fig. 2. Triploid blastomere presenting three 
signals for chromosomes 1 (green) and 10 (yellow) 
and one signal for chromosome Y (red). 


Table 1. Frequency of aneuploidies of chromosomes 1, 10 and Y in porcine
embryos

Gilt No.   No. of embryos                   No. of blastomeres             No. of embryos
           collected  analyzed  aberrant    analyzed  normal  aberrant     with chromosome Y

1          2          1         0           3         3       0            1
2          8          6         0           17        17      0            3
3          9          6         0           27        27      0            2
4          6          4         0           12        12      0            0
5          9          7         0           19        19      0            3
6          3          3         0           8         8       0            3
7          10         5         1           12        10      2            2
8          22         22        3           85        79      6            13
9          9          9         1           32        31      1            5
10         4          4         2           7         4       3            1
11         13         10        0           32        32      0            2
12         2          2         0           7         7       0            1
13         15         14        2           64        62      2            8
14         21         16        3           64        60      4            7
15         9          5         1           14        12      2            1

Total      142        114       13          403       383     20           52
%          100        80.28     11.40       78.10     95.04   4.96         45.61


Table 2. Cytogenetic findings in chromosomally aberrant embryos

Embryo  Gilt  No. of blastomeres        No. of aneuploid blastomeres                    Evaluation of
No.     No.   total  analyzed  normal   trisomy 1  trisomy 10  tetrasomy 1  tetrasomy 10  embryo

1       7     2      2         0        0          2           0            0             trisomy 10
2       8     4      4         0        4          3           0            0             3n
3       8     4      4         3        1          0           0            0             2n/trisomy 1
4       8     4      4         3        0          0           0            1             2n/tetrasomy 10
5       9     4      3         2        0          1           0            0             2n/trisomy 10
6       10    4      2         0        2          1           0            0             3n
7       10    4      2         1        1          1           0            0             2n/3n
8       13    6      5         4        0          0           1            1             2n/4n
9       13    6      4         3        1          1           0            0             2n/3n
10      14    6      3         1        2          0           0            0             2n/trisomy 1
11      14    6      4         3        0          1           0            0             2n/trisomy 10
12      14    4      4         3        0          0           0            1             2n/tetrasomy 10
13      15    4      2         0        2          0           0            0             trisomy 1


Table 3. Comparison of hyperploidy in porcine sperm, oocytes and
embryos (%)

Chromosomal abnormalities          sperm^a   oocytes^b   embryos^c

Plus one copy of chromosome 1      0.08      1           0.9
Plus one copy of chromosome 10     0.07      0.7         0.9
Plus one copy of chromosome Y      0.09      n.a.^f      0
Polyploid                          0.2^d     27.7^d      1.8^e

a Rubes et al. (1999).
b Vozdova et al. (2001).
c Present study.
d Diploid.
e Triploid.
f n.a. = not applicable.


Table 3 summarizes the results of our studies. Values obtained for
embryos do not differ from those estimated by extrapolating data from
individual germ cells (oocytes and sperm). In previous studies we
determined the frequency of aneuploidy in porcine oocytes (Vozdova et
al., 2001) and sperm (Rubes et al., 1999), using the same
chromosome-specific probes for chromosomes 1, 10 and Y. No significant
difference between aneuploidy of chromosomes 1 and 10 was found in either
germ cells or embryos from pigs. From these data we hypothesize that the
average frequency of aneuploidy may be the same for any pig autosome.
Assuming there is no difference in the frequency of trisomy among the pig
autosomes, and after extrapolating data from this and our previous
studies, we propose that the frequency of embryos trisomic for any one of
the pig autosomes is 16.2%. However, in one of the two embryos classified
as trisomic in this study, it was possible to examine only half of the
cells (Table 2). Therefore it cannot be excluded that it was a mosaic
embryo. Moreover, the number of embryos examined was not sufficient to
assess precisely the incidence of a phenomenon as relatively rare as
trisomic embryos. The true frequency of trisomic embryos is therefore
likely to be lower. Possible differences in trisomy frequency among
individual chromosomes need to be tested in larger numbers of embryos
using probes for several chromosomes. Viuff et al. (2000) reported that
the frequency of trisomy/monosomy of chromosomes 6 or 7 is low (at the
per mille rather than the percent level) in bovine embryos. They detected
only one trisomy/monosomy mosaicism in a total of 426 bovine embryos
examined. The authors suggest that the total proportion of non-balanced
gametes does not reach more than a few percent in cattle.
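
The arithmetic behind these comparisons is not spelled out in the text. The following Python sketch shows one way to reproduce it from the values in Table 3, under two assumptions that are ours rather than the authors': that the expected embryo frequency is approximated by adding the sperm and oocyte frequencies (reasonable only while both are small), and that the 16.2% figure comes from multiplying the per-chromosome embryo frequency by the 18 autosome pairs of the pig (2n = 38).

```python
# Hyperploidy frequencies in percent, as given in Table 3.
SPERM = {"chr1": 0.08, "chr10": 0.07}
OOCYTES = {"chr1": 1.0, "chr10": 0.7}
EMBRYOS = {"chr1": 0.9, "chr10": 0.9}

# The domestic pig has 2n = 38 chromosomes, i.e. 18 autosome pairs.
PIG_AUTOSOME_PAIRS = 18


def expected_from_gametes(chrom: str) -> float:
    """Assumed approximation: rare events in sperm and oocyte simply add up."""
    return SPERM[chrom] + OOCYTES[chrom]


def all_autosome_trisomy(per_chromosome_percent: float,
                         pairs: int = PIG_AUTOSOME_PAIRS) -> float:
    """Extrapolate one chromosome's trisomy frequency to all autosomes,
    assuming every autosome has the same frequency."""
    return per_chromosome_percent * pairs


for chrom in ("chr1", "chr10"):
    print(chrom, "expected:", round(expected_from_gametes(chrom), 2),
          "observed:", EMBRYOS[chrom])
print("all autosomes:", round(all_autosome_trisomy(EMBRYOS["chr1"]), 1), "%")
```

Under these assumptions the gamete-based expectations (about 1.1% and 0.8%) bracket the 0.9% observed in embryos, and the extrapolation to all autosomes reproduces the 16.2% quoted above.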

Viuff et al. (2001) reported the frequency of chromosome aberrations in
bovine embryos. Embryos included in our study (mean number of cells per
embryo 4.5) are comparable with those obtained by these authors on day 2
postovulation (mean number of cells per embryo 4.7). They found 5% of
mixoploid embryos in that group. If we include the embryo classified as
triploid (with only one half of its blastomeres examined in our study) in
the group of mixoploid embryos, the group examined in our study then
comprises 3.5% of mixoploid embryos. The frequency of polyploid embryos
detected by the authors was very close to our finding (2 versus 1.8%).
These authors used probes specific for centromeric regions of bovine
chromosomes 6 and 7. According to their scoring criteria, they did not
assess single chromosome trisomy per se, but determined the frequency of
ploidy levels (diploidy, triploidy, tetraploidy, etc.). Although we used
the same scoring criteria as Viuff et al. (2001) for determining ploidy
levels, an important difference in our study is that we also determined
single chromosome trisomy. As for comparison of our findings with
previous data obtained in pigs by karyotyping, we can compare the total
number of aberrant embryos detected. Our finding of 11.4% approximates
the data published by McFeely (1967) and by Van der Hoeven et al. (1985),
who detected 9.4 and 7.3% of aberrant embryos, respectively.

In contrast to humans, where gonosomal aneuploidy is com¬ 
monly found in embryos and newborns, aneuploidies of the 
porcine sex chromosomes are rare. Anomalies in the number of 
sex chromosomes have been described in a few individual pigs 
and involved only chromosome X (Chowdhary, 1998). Our 
results in porcine embryos are in agreement with these findings, 
as we did not observe any abnormalities in the number of the 
male sex chromosome. 

Thirty percent of human preimplantation embryos generated by assisted
reproductive technology contain a proportion of aneuploid cells (Wells
and Delhanty, 2001). The proportion of bovine mixoploid embryos produced
in vivo and in vitro has been reported to be 5 and 22% two days post
insemination (PI), 31 and 42% five days PI, and 25 and 72% eight days
PI, respectively (Viuff et al., 1999, 2000, 2001).

Although numerical chromosome aberrations have been found in porcine
embryos (Chowdhary, 1998), complete aneuploidy involving autosomes has
not yet been observed in liveborn pigs, suggesting that complete
aneuploidy may be incompatible with life in pigs; however, autosomal
trisomy mosaicism, involving chromosomes 18 and 14, has previously been
observed in liveborn pigs (Chowdhary, 1998). Considering that complete,
non-mosaic numerical chromosome abnormalities have only been found at
early embryonic stages of development, it has been suggested that
trisomic or triploid porcine embryos fail to implant or arrest very early
in embryonic development; the situation is probably more complicated for
mixoploid embryos. Evsikov and Verlinsky (1998) suggest that if the
proportion of aneuploid cells within an embryo is under some threshold,
cavitation of the morula stage embryo takes place and initiates negative
selection against aneuploid cells, and development of the embryo
continues. If the proportion of aneuploid cells exceeds such a threshold,
the embryo can be arrested in development. They also note that embryos
with mostly abnormal and aneuploid nuclei, even with major genetic
abnormalities, are not necessarily excluded from preimplantation
development to the blastocyst stage.

Our results imply that chromosome aneuploidy can be an important cause of
embryonic mortality in pigs. Considering that complete aneuploidies have
not been observed in liveborn piglets, we estimate that the total level
of embryonic mortality caused by numerical chromosome disorders ranges
between 10 and 20% in pigs. These findings are in accordance with data
published by Gustavson (1990), who estimated embryonic mortality caused
by chromosome abnormalities to be one-third of developing porcine
embryos. Some of the earlier studies considered by Gustavson were on
developmentally late-stage embryos with polyploid cells observed in the
trophectoderm. Recent studies indicate that polyploidy exists in the
trophectoderm of normally developing embryos (Viuff et al., 2002).
Therefore the estimate of embryo mortality rate by Gustavson may be an
overestimate and may not be representative of normal porcine embryos.

To our knowledge, this is the first report on the frequency of aneuploidy
for chromosomes 1, 10 and Y in pig embryos detected by FISH. We
determined the baseline aneuploidy frequency in 114 embryos and have
shown that 88.6% were normal diploid, 1.8% were triploid and 1.8% were
trisomic for chromosome 1 or 10. No cell was observed to be aneuploid for
chromosome Y. These findings were compared with results from our previous
studies on aneuploidy in porcine sperm (Rubes et al., 1999) and oocytes
(Vozdova et al., 2001). The sum of the frequencies of aneuploidy in germ
cells from our previous studies is in agreement with the level of
chromosome aneuploidy observed in embryos in this study.


Equine genomics: galloping to new frontiers

Abstract. Analysis of the horse genome is proceeding at a rapid pace.
Within a short span of 6-7 years, ~1,500 markers have been mapped in the
horse, of which at least half are genes/ESTs. Health, performance and
phenotypic characteristics are of major concern/interest to horse
breeders and owners. Current efforts to analyze the equine genome are
primarily aimed at developing critical resources (including an advanced
gene map) that could readily be used in the near future to i) identify
genes and mutations responsible for inherited equine diseases/disorders
and to formulate approaches for accurate diagnostics, therapeutics and
prevention, ii) discover genes associated with various other traits of
significance, e.g. fertility, disease resistance, coat color and athletic
performance, and iii) use functional genomic approaches to identify gene
regulatory events involved in the manifestation of various diseases.

Copyright©2003 S. Karger AG, Basel 


Introduction 

Horse genomics was recognized as essential to veterinary medicine and
horse breeding several years after genome analysis programs had been
initiated worldwide for cattle, pig, sheep and chicken. Although the
horse is not an animal reared primarily for food and production, the
industry has a strong economic impact through its role in sports and
recreation. The significance of this is amply reflected by the fact that
in the USA alone, the equine industry's annual production and economic
impact on the 1996 US gross domestic product was $25.3 billion and $112
billion, respectively (Barents, 1996). Health and welfare of the horse is
obviously of primary concern for the entire industry. Hence advances have
to be made that can help develop approaches leading to better
understanding of diseases, and improved ways for diagnosis and treatment.
A highly developed gene map in the horse will be useful to identify genes
responsible for these valuable traits associated with equine biology,
health and performance.

Consequently, in 1995 a workshop was formed to foster col¬ 
laboration and share resources among scientists with a goal of 
developing a gene map for the horse. The workshop has 
involved scientists from Europe, the United States, Canada, 
Australia, New Zealand, Japan and South Africa with financial 
support from the Dorothy Russell Havemeyer Foundation and 
the 1998-2003 USDA-NRSP8 initiative of the National Ani¬ 
mal Genome Program. The organization of the workshop, 
meetings and workshop resources are described at the website: 
http://www.uky.edu/AG/Horsemap/. 

Current status 

Despite a late start, horse genomics has benefited from the attention of
a vigorous and effective workshop community. An overview of the current
status of research in equine genome analysis helps convey the progress
made to date. At present, the four central pillars of the horse gene map
are the synteny map, the genetic linkage map, the cytogenetic map and the
RH map. The comparative map, which forms the fifth pillar, is actually
the joint outcome of these four maps. The majority of the published
genomic information from these maps is available on databases at two
websites (Table 1). The database maintained at the Institut National de
la Recherche Agronomique (INRA) lists 1,275 known loci in the equine
genome, of which 1,021 are reported as mapped. Our estimates on the
breakdown of these loci into the five major maps are shown in Table 1.

Synteny map 

Five reports document the generation of somatic cell hybrid (SCH) panels
(Lear et al., 1992; Williams et al., 1993; Bailey et al., 1995; Raney et
al., 1998; Shiue et al., 1999); however, the UC Davis panel (Shiue et
al., 1999) has indisputably been the most significant contributor to the
horse gene map. With around 450 markers analyzed on this panel, this
resource has been central to the assignment of syntenic groups
(comprising genes and microsatellites) to all equine chromosomes (Caetano
et al., 1999a, b; Shiue et al., 1999; Lindgren et al., 2001a, b). The
resource has also contributed to the development of the horse-human
comparative map (Caetano et al., 1999a; Chowdhary et al., 2003).

Linkage map 

Three reference family resources contribute to the generation of the
linkage maps. These include: i) the Uppsala half-sib family (Lindgren et
al., 1998), which provided the first autosomal linkage map of the horse
genome by analysis of 140 genetic markers; ii) the International Horse
Reference Family Panel (IHRFP), on which 161 loci identifying 29 linkage
groups on 26 autosomes were analyzed during stage I (Guerin et al., 1999)
and 342 markers identifying 31 linkage groups (all autosomes) spanning
2,262 cM during stage II (Guerin et al., 2003); and iii) the Animal
Health Trust (AHT) 3-generation full-sib family, on which 353
microsatellites identifying 42 linkage groups spanning a distance of
1,780 cM over all autosomes and the sex chromosomes were analyzed
(Swinburne et al., 2000). An updated analysis of markers on this resource
is expected to report ~700 microsatellites distributed over the equine
genome (Dr. Matthew Binns; personal communication). There is good
agreement between the three linkage maps for location and order of
markers. Approximately one-third of the markers are shared between any
two maps.


Table 1. Map status and genomics resources for the horse

Chromosomes                31 autosomes, X, Y (ISCNH, 1997 for ideogram)
Genome size (predicted)    3,000 Mb
Mitochondrial genome       17,000 bp (Xu and Arnason, 1994; Vila et al., 2001)

Mapped markers (see text for references)

Map type                   Total   Type I   Type II

Synteny map^a              479     142      337
Genetic linkage map^a      455     45       410
Cytogenetic map^a          400     220      180
RH map^a                   730     258      472
Comparative map^a          447     447      —

Resources

EST sequences
  In GenBank               ~4,400
  In progress              ~30,000
cDNA libraries             ~13 libraries
BAC libraries              3 available: Texas A&M (TAMU) 3×, Institut
                           National de la Recherche Agronomique (INRA) 3×,
                           Children's Hospital of Oakland Research
                           Institute (CHORI-241) 11×

Websites (databases)

http://locus.jouy.inra.fr/cgi-bin/lgbc/mapping/common/main.pl?BASE=horse
http://www.thearkdb.org/

Workshop information

http://www.uky.edu/AG/Horsemap/

a Many markers overlap on the maps, bringing the tally of mapped markers
over 1,000.


Cytogenetic map

Since the first fluorescent in situ hybridization (FISH) mapping reported
in the horse by Oakenfull et al. (1993), considerable work has gone into
rapidly expanding the cytogenetic map (e.g., see Breen et al., 1997;
Raudsepp et al., 1999; Godard et al., 2000; Lear et al., 2001; Lindgren
et al., 2001a, b; Mariat et al., 2001; Raudsepp et al., 2001; Chowdhary
et al., 2002; Milenkovic et al., 2002; Raudsepp et al., 2002). Most
recently, 81 new markers were added to this map to align the RH map to
individual equine chromosomes (Chowdhary et al., 2003). Of these, 16 were
mapped to ECA17 alone (Lee et al., 2003).


RH map

Two radiation hybrid panels have been made in the horse: a 3,000-rad
panel (Kiguwa et al., 2000) and a 5,000-rad panel (Chowdhary et al.,
2002). The 5,000-rad panel has been extensively utilized, first to obtain
comprehensive RH and comparative maps for ECA11 (Chowdhary et al., 2002)
and ECAX (Raudsepp et al., 2002) and very recently to develop the first
generation RH and comparative map of the equine genome (Chowdhary et al.,
2003). This map, comprising a total of 730 markers (258 type I and 472
type II), is the first comprehensive map of the horse that incorporates
type I as well as type II markers, integrates synteny, cytogenetic and
meiotic maps into a consensus map and provides the most detailed
genome-wide information to date on the organization and comparative
status of the equine genome. Lately a 1.4-Mb resolution map of ECA17
comprising a total of 75 loci has been published (Lee et al., 2003). This
map provides one of the best marker coverages for any of the horse
chromosomes.

Comparative map 

Initial efforts to develop regional comparisons between the 
horse and human genomes were reported by Chowdhary 
(1992). Later, the Zoo-FISH approach was used to delineate a 
gross whole genome horse-human comparative map (Raudsepp 
et al., 1996; Chowdhary et al., 1998). This landmark map is still 
an essential tool in all laboratories working on equine genom¬ 
ics. Since then, more informative comparative maps have been 
published in the horse (e.g., Caetano et al., 1999a, b; Chowdha¬ 
ry et al., 2002; Milenkovic et al., 2002; Raudsepp et al., 2002; 
Chowdhary et al., 2003). Of these, the latest map provides an 
expanded whole genome comparison between horse, human 
and mouse genomes. The newly published 1.4-Mb resolution map of ECA17
improves on previous maps and is among the most comprehensive comparative
maps for any domestic species (Lee et al., 2003).


Table 2. Genes with molecular markers and applications

Locus    Description                               Breed           Citation

HYPP     Hyperkalemic periodic paralysis           Quarter Horse   Rudolph et al., 1992
SCID     Severe combined immunodeficiency          Arabian         Shin et al., 1997
OLWFD    Overo lethal white foal disease           Paint           Metallinos et al., 1998; Santschi et al.,
                                                                   1998; Yang et al., 1998
H-JEB    Herlitz junctional epidermolysis bullosa  Draft           Spirito et al., 2002; Milenkovic et al., 2003
MC1R     Extension (red/black color)               All breeds      Marklund et al., 1996
MATP     Cremello color                            All breeds      Locke et al., 2001; Mariat et al., 2003
G        Gray color (map only)                     All breeds      Henner et al., 2002; Locke et al., 2002;
                                                                   Swinburne et al., 2002
TO       Tobiano color (map only)                  Paint           Brooks et al., 2002
ASIP     Agouti color                              All breeds      Rieder et al., 2001
LP       Appaloosa color (map only)                Appaloosa       Terry et al., 2003
W        Dominant white                            All breeds      Mau, 2003


Applications 

The horse gene map was initially used to identify genes 
responsible for traits showing simple Mendelian inheritance. 
Among the first analyzed were hyperkalemic periodic paralysis 
(HYPP; Rudolph et al., 1992), severe combined immunodefi¬ 
ciency disease (SCID; Bailey et al., 1997; Shin et al., 1997), 
overo lethal white foal disease (OLWFD; Metallinos et al., 
1998; Santschi et al., 1998; Yang et al., 1998) and Herlitz junctional
epidermolysis bullosa (H-JEB; Spirito et al., 2002; Milenkovic
et al., 2003), for which comparative approaches were
used to identify the responsible genes and the causal mutation 
(Table 2). At present, molecular diagnostic tests are available 
for these conditions. 

Coat color patterns are other simple genetic traits of interest 
to horse breeders. During recent years, comparative genomics 
and whole genome scanning approaches were used either to 
map or to develop DNA tests for a variety of coat colors, viz., 
extension (Marklund et al., 1996), cremello (Locke et al., 2001; 
Mariat et al., 2003), agouti (Rieder et al., 2001), gray (Henner et 
al., 2002; Locke et al., 2002; Swinburne et al., 2002), tobiano 
(Brooks et al., 2002) and appaloosa (Terry et al., submitted). 
These studies demonstrated the power and efficacy of the horse 
gene map to identify the hereditary components of traits of 
interest. The map has also found applications in ongoing 
research into fertility (sex-reversal; Kent et al., 1988; Vaughan 
et al., 2001), the cause and treatment of melanoma in ageing 
grey horses (Rieder et al., 2000), recurrent exertional
rhabdomyolysis (MacLeay et al., 1999) and polysaccharide storage
myopathy (Valberg et al., 1996). Increased resolution of the 
map will help to identify loci responsible for these and other 
health-related conditions of interest to horse breeders. 


Long-term goals of equine genomics vs. the limitations in 
the current map 

The long-term goals of equine genome analysis are: 

1. Identification of genes and mutations responsible for inherited
diseases/disorders and formulation of approaches for accurate
diagnostics, therapeutics and prevention.

2. Discovery of genes associated with various other traits of
significance, e.g. fertility, disease resistance, coat color and athletic
performance.

3. Use of functional genomic approaches to identify gene regulatory
events involved in the manifestation of various diseases.

The present resolution of the horse gene map is insufficient for
long-term practical utility. Fundamental problems of the map include:
i) low density of mapped polymorphic and gene-specific markers; ii) lack
of polymorphism of microsatellite markers in the breeds or families
currently being examined; iii) clearly uneven distribution of the
different types of markers; iv) lack of much needed integration of maps
obtained by different approaches; and v) inadequate alignment of the
horse gene map with the human and mouse maps.
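
To put the first of these problems in numbers, the sketch below estimates the average spacing between markers on each map from the totals in Table 1 and the predicted genome size of 3,000 Mb. It is only a back-of-the-envelope illustration: it assumes markers are spread uniformly over the genome, which the third problem listed above says is not the case, and the function name is ours.

```python
# Predicted genome size and marker totals as listed in Table 1.
GENOME_SIZE_MB = 3000
MARKERS = {
    "synteny": 479,
    "genetic linkage": 455,
    "cytogenetic": 400,
    "RH": 730,
}


def mean_spacing_mb(n_markers: int, genome_mb: float = GENOME_SIZE_MB) -> float:
    """Average distance between neighboring markers, assuming (unrealistically)
    that the markers are evenly distributed over the genome."""
    return genome_mb / n_markers


for map_type, n in MARKERS.items():
    print(f"{map_type} map: one marker per ~{mean_spacing_mb(n):.1f} Mb")
```

Even the densest map in the table, the RH map, averages roughly one marker every 4 Mb under this assumption, compared with the 1.4-Mb resolution reported for ECA17.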

The only solution to these problems is an overall, rapid and 
systematic expansion of the gene map. The new map will trigger 
worldwide interest in identifying the molecular basis of Mende¬ 
lian traits as well as help to move horse genetics into the realm 
of defining genetic influences on complex/polygenic traits such 
as chronic obstructive pulmonary disease, osteochondrosis
dissecans, infectious disease susceptibility and resistance, racing
speed, endurance, conformation, and behaviors. Furthermore,
a high-resolution map of the equine genome will also
contribute to an improved comprehension of the comparative 
genome organization in horse and will significantly add (as a 
representative species of the order Perissodactyla) to our 
present knowledge of mammalian evolution. 

Future prospects 

Future emphasis will most likely fall on the following areas:

Map development 

There are very good prospects that, as in other important farm animal
species, the gene map of the horse will expand rapidly during the coming
2-3 years. RH, genetic linkage and cytogenetic maps will contribute the
most. These maps will finely align the horse map to the human/mouse/rat
maps as well as to the maps of the domestic species for which whole
genome sequence data will become available during the coming years (e.g.,
cattle, dog, chicken and pig). In concert, effective bioinformatics tools
will have to be developed to assimilate equine genome information easily
and rapidly in relation to that of other species.

Resource material and genetic analysis 

One of the major limitations for initiating detailed molecu¬ 
lar studies for several equine hereditary conditions is the lack of 
sufficient samples. This obviously hampers drawing accurate 
conclusions regarding the characteristics and mode of inheri¬ 
tance of individual conditions. Moreover, large datasets are of 
paramount importance in multigenic traits. Hence, a central 
resource dedicated to collection, storage and distribution of 
material will play a major role in facilitating research in this 
direction. 

Functional genomics 

One of the fundamental changes associated with diseases is 
a change in gene expression. Genes are up-regulated and down- 
regulated in response to inflammation, stress, infection, exer¬ 
cise and even changes in the daily routine. In some cases the 
changes involve inherited differences between individuals. In 
other cases the changes are fundamental to all horses. To inves¬ 
tigate gene expression we need to characterize the genes 


expressed in different tissues. It is anticipated that during com¬ 
ing years extensive DNA sequence information on expressed 
genes in the horse will be available. This will trigger develop¬ 
ment of microarrays to investigate global gene expression. 
Strong bioinformatics resources will have to be developed concurrently
to compile, analyze and share this information.

Conclusions 

Equine genomics is now in its most exciting phase. We can 
foresee a day when routine diagnostic tests in horses will 
involve collection of tissues to investigate gene expression or 
genotypes associated with various conditions. The possibility of
identifying genes related to performance - a highly complex 
trait - may appear a long shot, but cannot be completely ruled 
out. It must, however, be kept in mind that developments 
through equine genome analysis will not supplant current 
methods of evaluation and diagnosis but will complement 
them. Breeders and veterinarians will not change the way they 
evaluate horses, but will have more tools at their disposal. For 
these reasons, the genomics race is worth the run, and the horse
genome is already galloping.


Abstract. A physical map of ordered bacterial artificial
chromosome (BAC) clones was constructed to determine the 
genetic organization of the horse major histocompatibility 
complex. Human, cattle, pig, mouse, and rat MHC gene 
sequences were compared to identify highly conserved regions 
which served as source templates for the design of overgo prim¬ 
ers. Thirty-five overgo probes were designed from 24 genes and 
used for hybridization screening of the equine USDA CHORI 
241 BAC library. Two hundred thirty-eight BAC clones were 
assembled into two contigs spanning the horse MHC region. 
The first contig contains the MHC class II region and was 
reduced to a minimum tiling path of nine BAC clones that span 
approximately 800 kb and contain at least 20 genes. A mini¬ 


mum tiling path of a second contig containing the class III/I 
region is comprised of 14 BAC clones that span approximately 
1.6 Mb and contain at least 34 genes. Fluorescence in situ 
hybridization (FISH) using representative clones from each of 
the three regions of the MHC localized the contigs onto 
ECA20q21 and oriented the regions relative to one another and 
the centromere. Dual-colored FISH revealed that the class I 
region is proximal to the centromere, the class II region is dis¬ 
tal, and the class III region is located between class I and II. 
These data indicate that the equine MHC is a single gene-dense 
region similar in structure and organization to the human 
MHC and is not disrupted as in ruminants and pigs. 

Copyright©2003 S. Karger AG, Basel 


The major histocompatibility complex (MHC) is the most 
gene-dense region of the human genome (The MHC Sequenc¬ 
ing Consortium, 1999) and contains many genes that function 
in the innate and adaptive immune responses (Trowsdale, 
2001). By convention, the MHC is divided into three regions
(class I, class II, and class III) based upon gene function. The
definitive genes of the MHC are the class I and class II genes, 
which encode cell surface glycoproteins that present endoge¬ 
nous and exogenous peptides to lymphocytes, respectively. 
Peptide-loaded MHC class I molecules are recognized by the 
antigen-specific receptors of CD8+ cytotoxic T lymphocytes 
and are expressed on virtually all nucleated mammalian cell 
types, with the notable exception of most forms of trophoblasts 
(Donaldson et al., 1992). Peptide-loaded MHC class II molecules
have a more limited expression profile that is closely associated
with, but not restricted to, antigen presenting cells. In the
horse, MHC class II molecules are expressed constitutively on
T lymphocytes (Crepaldi et al., 1986), an expression pattern
that differs from that of humans and mice. MHC class II molecules
are recognized by the antigen receptors of CD4+ helper T
lymphocytes. The class III region is highly gene dense, containing
a number of genes that function in the immune response
such as complement component 4 (C4) and tumor necrosis factor
alpha (TNF), and other genes apparently not functionally
associated with the immune system.

MHC-encoded genes have been associated with susceptibility
to numerous infectious and non-infectious diseases in humans
(Dawkins et al., 1999), making the MHC a region of considerable
biomedical interest. However, association of diseases
with the MHC is complex and identification of contributing
genes is difficult due to high levels of polymorphism, linkage
disequilibrium, and gene duplications (Apanius et al., 1997).
Insight into the function and evolution of the MHC can be
gained from comparative mapping in various mammalian species.

The genetic content of the MHC was first described in mice 
and humans, but has since been characterized in other mammalian
species and appears to be evolutionarily conserved
(Chardon et al., 1999; Beck et al., 2001; Gunther and Walter,
2001; McShane et al., 2001; Walter et al., 2002). However, significant
differences in the number and physical organization of
the genes within the MHC have been demonstrated in several 
species of the Artiodactyla (Smith et al., 1995; Band et al., 
1998; Chardon et al., 1999; McShane et al., 2001). 

The MHC of the horse (Equus caballus), also called ELA for 
equine lymphocyte antigen (reviewed by Marti et al., 1996; Bailey
et al., 2000), is located on ECA20q14→q22 (Ansari et al.,
1988; Makinen et al., 1989). Restriction fragment length polymorphism
(RFLP) analysis indicated that 20-30 class I genes
and/or pseudogenes might be present in the ELA (Alexander et
al., 1987), but only one complete genomic sequence and four or 
five full-length cDNA sequences of ELA class I genes have been 
reported to date (Barbis et al., 1994; Ellis et al., 1995; Carpenter 
et al., 2001). ELA class II genes have been examined for linkage 
and polymorphism at the DRA, DRB, DQA, and DQB loci 
(Szalai et al., 1994a, 1994b; Albright-Fraser et al., 1996; Fraser 
and Bailey, 1996, 1998; Hedrick et al., 1999; Horin and Matiasovic,
2002), but little other information is known regarding
content and organization of the ELA region. 

In order to better define the genomic structure of ELA, we 
used a comparative genomics approach to identify BAC clones 
containing equine MHC genes and then performed a series of 
analyses to assemble the BAC clones into contigs of defined 
gene content and chromosomal orientation. Here we report 
that the ELA appears to be similar to HLA in size, genetic content,
and organization.


Materials and methods 

Design of overlapping oligonucleotide (overgo) probes 
Highly conserved regions within 24 orthologous genes (Table 1) in the 
MHC were identified by aligning available gene sequences from multiple 
species using GenBank Pairwise BLAST® (Tatusova and Madden, 1999).
The sequences were analyzed by RepeatMasker for the presence of repetitive
elements (http://repeatmasker.genome.washington.edu/cgi-bin/RepeatMasker).
Overgo primers were designed to the conserved regions using the
Overgo Maker program (http://www.genome.wustl.edu/tools/7overgo.html)
and were screened against GenBank using BLAST® (Altschul et al., 1990) to
confirm specificity and exclude repetitive elements. 
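
The defining property of an overgo pair, two oligonucleotides whose 3′ ends anneal to each other so that the single-stranded overhangs can be filled in, can be checked with a few lines of code. The sketch below is illustrative only: it assumes the usual design of two 24-mers sharing an 8-base overlap (which gives a 40-bp probe after fill-in), a detail the text does not state, and the function names are hypothetical. The example sequences are the first ABCF1 pair listed in Table 1.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def overgo_overlap(forward: str, reverse: str, overlap: int = 8) -> bool:
    """True if the 3' ends of the two oligos can anneal over `overlap` bases.

    The 8-base default is an assumption about the design, not a value
    taken from the text.
    """
    return forward[-overlap:] == reverse_complement(reverse[-overlap:])


def filled_in_probe(forward: str, reverse: str, overlap: int = 8) -> str:
    """Top strand of the duplex obtained by fill-in of an annealed pair."""
    if not overgo_overlap(forward, reverse, overlap):
        raise ValueError("3' ends are not complementary")
    return forward + reverse_complement(reverse)[overlap:]


# First ABCF1 overgo pair from Table 1.
abcf1_fwd = "GCCTCATCACAGAAACCAACTGCC"
abcf1_rev = "TCCACCACCCACACCTGGCAGTTG"

probe = filled_in_probe(abcf1_fwd, abcf1_rev)
print(overgo_overlap(abcf1_fwd, abcf1_rev), len(probe))
```

The same check can be run over every primer pair in Table 1 to catch transcription errors in the sequences.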

Overgo labeling and BAC library screening by filter hybridization 
High-density filters from the USDA CHORI 241 equine BAC library 
(http://bacpac.chori.org/equine241.htm) were probed with radiolabeled 
overgos to identify clones containing ELA sequences. The overgo primers 
were radioactively labeled using either a modification of the BACPAC
hybridization protocol (http://www.chori.org/bacpac/) or a modified overgo
protocol (http://genome.wustl.edu/tools/protocols/mapping/Prehyb.pdf). A
10-µl labeling reaction containing 1 µM forward primer, 1 µM reverse primer,
150 Ci/mmol each of 32P dATP and 32P dCTP (Amersham Biosciences,
Piscataway, NJ), 2 U Klenow fragment DNA polymerase (Roche, Indianapolis,
IN), and 1x DNA Polymerase Buffer (Promega, Madison, WI) was
incubated at 37 °C for 30 min. For fill-in labeling, 1 µl of a 250-µM dATP
and dCTP mixture was added to each reaction and incubated at 37 °C for
15 min (Han et al., 2000). Unincorporated nucleotides were removed using
Sephadex G-10 gravity flow columns. The labeled overgo probes were pooled
and added to the hybridization solution (20x SSPE, 10% SDS, 5% milk, and
100x Denhardt’s) containing 50% formamide, denatured by boiling for
10 min, chilled, and then hybridized onto filters at 42 °C for 16 h. Filters
were washed three times at 55 °C for 15 min in 2x SSPE. For some of the
probes (see Table 1), hybridization and wash steps were done at 60 °C. After
overnight hybridization, filters were washed for 30 min with 1 mM EDTA,
1% SDS, and 40 mM Na2HPO4, followed by two washes of 20 min each with
1.5x SSC and 0.1% SDS. A final 20-min wash was done with 0.5x SSC,
0.1% SDS. The filters were exposed to film over intensifying screens for two
days at -80 °C and the autoradiograms developed.

DNA fingerprinting, Southern blot analysis and contig assembly 

Alkaline-lysis extracted DNA from MHC positive clones was fingerprinted
by restriction enzyme digestion with BamHI and analyzed on a
0.65% TBE agarose gel containing ethidium bromide. Restriction fragment
patterns were compared to identify overlapping BAC clones, which were
then assembled into draft contigs using a modification of the Marra et al.
(1997) program. The gel images were digitally captured with the Alpha Innotech
ChemiImager system and analyzed using IMAGE 3.0 software (Sulston
et al., 1988). Preliminary contig assembly was performed using FPC V4.7.9 
(Soderlund et al., 1997). DNA fragments were Southern blotted onto nylon 
filters for subsequent hybridization with individual overgo probes to identify 
specific genes. 
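
The idea behind fingerprint comparison, that two clones sharing many restriction fragments of the same size probably overlap, can be illustrated with a simple band-matching count. This is a simplified sketch and not the scoring used by IMAGE or FPC; the 2% size tolerance, the function name, and the fragment sizes are all hypothetical.

```python
def shared_bands(clone_a, clone_b, tolerance=0.02):
    """Count fragments of clone_a that match an unused fragment of clone_b.

    Two fragment sizes match when they differ by no more than `tolerance`,
    expressed as a fraction of the larger size (an assumed criterion).
    """
    remaining = sorted(clone_b)
    matches = 0
    for size in sorted(clone_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                matches += 1
                del remaining[i]  # each band may be matched only once
                break
    return matches


# Hypothetical BamHI fragment sizes in bp, not data from this study.
bac_1 = [12000, 9500, 7200, 4300, 2100]
bac_2 = [9450, 7250, 4300, 3100, 1500]
bac_3 = [15000, 8000, 600]

print(shared_bands(bac_1, bac_2))  # 3 shared bands: likely overlap
print(shared_bands(bac_1, bac_3))  # 0 shared bands: no evidence of overlap
```

In practice the number of shared bands is converted into a probability of chance coincidence before clones are joined, which this sketch omits.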

DNA sequencing 

End sequencing of the BAC clones was performed on an ABI 3100 automated
capillary sequencer using pTARBAC 2.1 derived sequencing primers
T7.29 (5′-GCCGCTAATACGACTCACTATAGGGAGAG) from
(http://bacpac.chori.org/cyclesere.htm) and SP6.26
(5′-CCGTCGACATTTAGGTGACACTATAG).

Confirmation of overlapping BACs 

PCR primers or overgo primers designed from the end sequences of
selected BAC clones (Table 2) were used on ELA positive BAC clones to confirm
gene content and the overlaps indicated by fingerprinting. PCR was
carried out in 25-µl reactions containing 50 ng of BAC DNA from individual
clones as the template, 0.25 U of JumpStart Red Taq Polymerase (Sigma, St.
Louis, MO), 0.8 mM dNTPs, 0.4 µM of each primer, Master Amp PCR
Enhancer (Epicentre, Madison, WI), and 10x reaction buffer (100 mM Tris-HCl,
500 mM KCl, 15 mM MgCl2, and 0.01% gelatin). The thermal profile
was as follows: 2 min at 95 °C; four cycles of 30 s at 95 °C, 30 s at 58 °C
(-1 °C/cycle), 25 s at 55 °C; 30 cycles of 30 s at 95 °C, 30 s at 54 °C, 25 s at
55 °C; 10 min at 65 °C. Overgo hybridizations were carried out on dot-blots
of BAC clones that comprised the minimum tiling path of the contigs. The 
overgo labeling and hybridization protocols were performed as previously 
described. 
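
The touchdown part of the thermal profile above can be written out cycle by cycle. The sketch below assumes one reading of the profile: the annealing temperature starts at 58 °C and drops by 1 °C in each of the four touchdown cycles, followed by 30 cycles at 54 °C. The function name and parameters are illustrative.

```python
def touchdown_profile(start=58.0, step=1.0, touchdown_cycles=4,
                      final=54.0, final_cycles=30):
    """Return the annealing temperature (deg C) for every PCR cycle."""
    # Touchdown phase: lower the annealing temperature by `step` per cycle.
    ramp = [start - step * i for i in range(touchdown_cycles)]
    # Main phase: constant annealing temperature.
    return ramp + [final] * final_cycles


temps = touchdown_profile()
print(temps[:5])   # [58.0, 57.0, 56.0, 55.0, 54.0]
print(len(temps))  # 34 cycles in total
```

The alternative conditions given in the footnotes of Table 2 (for example a touchdown from 60 to 51 °C followed by 30 cycles at 50 °C) can be generated by changing the arguments.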

Fluorescence in situ hybridization (FISH) 

Selected BAC clones (135 M23, 147 K21, 163 J11, 288 J19, 359 L18, 372
F10, 382 H22, 407 K07, 431 P04, 440 J07, 441 N13, 455 C07, 464 F03, and
528 E24) were labeled with biotin and/or digoxigenin according to the manufacturer’s
instructions using the BioNick Labeling System (Invitrogen, Carlsbad,
CA) and DIG-Nick Translation Mix (Roche, Indianapolis, IN), respectively,
and hybridized to horse metaphase chromosomes individually to confirm
their location on ECA20q14→q22. BAC clones were also differentially
labeled and cohybridized to metaphase chromosomes to determine relative
positions of clones. DNA labeling, in situ hybridization, signal detection,
microscopy, and image analysis were carried out as described in Raudsepp et
al. (1999) and Chowdhary et al. (2003).

Table 1. Overgo primers used in this study to identify BAC clones containing genes within the equine MHC region

Gene       Database accession number (species) a                 Forward overgo primer b   Reverse overgo primer b
ABCF1 c    10947134 (hs), 28499835 (m), 10863746 (r)             GCCTCATCACAGAAACCAACTGCC  TCCACCACCCACACCTGGCAGTTG
ABCF1 c    10947134 (hs), 28499835 (m), 10863746 (r)             TTTGACCTTGAGATGCAGAATCGG  AGAACTTCTGTGTGGGCCGATTCT
APOM c     CC875245 (h)                                          CTAGTATCCCCACTCACTGTAGAG  TTACCAGCGCTTCCTCCTCTACAG
BAT1 c     19913439 (hs), 9790068 (m), 6016839 (r), 509402 (p)   TGGCAGAGAACGATGTGGACAATG  TCATAGTCCAAGAGCTCATTGTCC
BAT1 c     19913439 (hs), 9790068 (m), 6016839 (r), 509402 (p)   GAAGCAGGTCATGATGTTCAGTGC  CTCTTTGCTCAAGGTAGCACTGAA
BAT3 d     13111924 (hs), 6114859 (m), us (c)                    CTGTTCATGACCGGAATGCCAACA  CCAACCATGACATAGCTGTTGGCA
BF c       14550403 (hs), 975231 (p)                             TGTCCTTCTGGCTTCTACCCGTAC  TACGAATCTGCACAGGGTACGGGT
BF c       14550403 (hs), 975231 (p)                             CTTTATCTTGGGCCTCTTGTCTGG  AGTCATGCTCACACCTCCAGACAA
BRD2 d     12408641 (hs), us (c)                                 CCCATGAGTTACGATGAGAAGCGG  TGTCCAGGCTCAACTGCCGCTTCT
BRD2 d     12408641 (hs), 6753909 (m)                            ATCAGTTTGCATGGCCATTCCGGC  ACAGCATCCACAGGCTGCCGGAAT
BRD2 d     12408641 (hs), 6753909 (m)                            AGTGTTACCAGTGCCCATCAGGTG  CAGAAGAGACAGCAGGCACCTGAT
BTNL2 c    9624968 (hs)                                          CAGGCTACAATCTGTCTGGTGCAG  AATAGGAAGGAGGCGACTGCACCA
BTNL2 c    9624968 (hs)                                          GCATCGCATCCAAGATAAAGATGG  TTCCGCATAGAACAGGCCATCTTT
C2 c       CC875225 (h)                                          AATCCATGACTCCTGCATGGCATG  CACCCCAGATTGTATGCATGCCAT
C6orf9 c   31389770 (h)                                          TCCTGGAGTTGCTGCTGAGAGTTC  ATTCGACCCCCACCATGAACTCTC
COL11A2 d  1480743 (h)                                           CGCAAGAACCCTGCTCGCACCTGC  AGAGTTTCAGGTCCCGGCAGGTGC
Class I d  435020 (h)                                            GCTTCATCACCGTCGGCTACGTGG  ACGAACTGCGTGTCGTCCACGTAG
Class I d  435020 (h)                                            CCCAACACTGACCTTTGTGCTTTC  AGGACATTAGATCAGGGAAAGCAC
Class I d  728545 (h)                                            TGGTGCCTTCTGGGGAGGAGCAGA  ACATGGCACGTGTATCTCTGCTCC
CSNK2B c   26787971 (hs), 7106276 (m), 13591929 (r)              TCGGCTTTTTGCGCTGTAGTGGTC  TCCAAGGAACCGCAGAGACCACTA
CSNK2B c   26787971 (hs), 7106276 (m), 13591929 (r)              GGTGAAACTCTACTGCCCCAAGTG  TGTGTACACGTCCATGCACTTGGG
CYP21A2 c  CC875248 (h)                                          GCTTCTAGGGAATTCTCTTTCCTC  AGATGATGCTGCAGGTGAGGAAAG
CYP21A2 c  CC875248 (h)                                          ATTCTCTCTCCTCACCTGCAGCAT  GGTGAGGTAACAGATGATGCTGCA
DAXX c     3868937 (hs), 6681134 (m), 18266689 (r)               GAAGAGTTCCTTGAACTGTGTAAG  GGTCTGATGTCTGCATCTTACACA
DAXX c     4503256 (hs), 2253706 (m), 18148938 (r)               CACTGTATGTGGCAGAGATTCGGC  TCCTTTTCCTGCAGCCGCCGAATC
DMA c      18765714 (hs), 6754119 (m)                            TCACGCTGAAGCCCCTGGAGTTTG  AAAGTGTTGGGCTTGCCAAACTCC
DMB d      6636790 (h)                                           TTAGCAACTTGGGGGAGCTCATTC  TACAGAATGTTCCCAGGAATGAGC
DMB c      6636790 (h)                                           CGCAAACTCTCAGTGGATGCTGCT  ATGTGGGAGGGGCCTCAGCAGCAT
DOA c      4504400 (hs), 6680150 (m)                             CCCCCTGTGATCAATATCACCTGG  TTTGGCCGTTGCGCAGCCAGGTGA
DOB d      32271 (hs), 561974 (m), 5901566 (c)                   GCAAAGGCTGACTGTTACTTCACC  CCTTTTCTGTCCCATTGGTGAAGT
DPA c      24797073 (hs), 20899739 (m)                           GCTGGGCCAGCCCAACACCCTCAT  CTTGTCAATGTGGCAGATGAGGGT
DQA d      2226318 (h)                                           ACTGAGAGAAGTGGCTACGGCAAA  GACTTCCAAGTTGTGTTTTGCCGT
DQA d      530126 (h)                                            AACACCCTCATCTGTCTTGTGGAC  CAGGAGGGAAGATGTTGTCCACAA
DQB d      530128 (h)                                            ATCTCCCCATCCAAGACAGAGGTT  GGTTGTGGTGGTTTAGAACCTCTG
DQB d      164203 (h)                                            ACATTGCCGAGTACTGGAACGGAC  TCCAGGACGTCCTTCTGTCCGTTC
DRA d      164236 (h)                                            GACCAACTTTTCCGCAAGTTCCAC  GCAAGAAGGGGAGATAGTGGAACT
DRA d      976200 (h)                                            GAGATTTTTCACGTGGATATGGAC  AGACCGTCTCCTTCTTGTCCATAT
DRB d      1228844 (h)                                           ACGCCGAGTACTGGAACGGGCAGA  TCATCCAGGACGTCCTTCTGCCCG
DRB d      us (h)                                                TGCTTGAGACAGTTCCTCAGAGTG  CAGGTGTAGACCTCTCCACTCTGA
FLOT1 d    6031143 (hs), 2149603 (m)                             TGTTTTTCACTTGTGGCCCAAATG  GAGACCACCATGGCCTCATTTGGG
GNL1 d     807999 (hs), 311933 (m), us (c)                       AGCCTGGGCAGAGAAACGTGGTTA  GGCCTTGGCTGTCTTGTAACCACG
GTF2H4 c   27498326 (hs), 6754093 (m)                            GAACCGAGTACACCTACAATGCAG  GAACTCCTGCAGATTCCTGCATTG
GTF2H4 c   27498326 (hs), 6754093 (m)                            GGTGTCCTGTATAACCAGTTCCTG  CAAAGTCCACTTGCGACAGGAACT
HSPA1A d   26787973 (hs), 14010866 (r)                           TGTTCCAGCACGGCAAGGTGGAGA  TGGTCGTTGGCGATGATCTCCACC
HSPA1A d   26787973 (hs), 14010866 (r)                           ATCCCCAAGGTGCAGAAGCTGCTG  CGTTGAAGAAGTCCTGCAGCAGCT
HSPA1B d   26787974 (hs), 497937 (c)                             TGTCCATCCTGACGATCGACGACG  TTCACCTCGAAGATGCCGTCGTCG
LY6G6C c   CC875223 (h)                                          CACTTGCAGAAATGGCTTGGCACT  TTTCTTGGCAGCCTCAAGTGCCAA
MICA c     18033157 (c), 6624722 (p)                             TTTTGCTGAGGGACACTTGGATGG  GCGCAGGAAGGCCTGACCATCCAA
MICA c     18033157 (c), 6624722 (p)                             AGTCTGGGGATGTCCTGCCTGATG  TGGTAGGTCCCATTCCCATCAGGC
MOG d      13645153 (hs), 548189 (m), us (c)                     CAGGATCCGGAATGTGAGGTTCTC  AAAACCTCCTTCATCTGAGAACCT
MSH5 c     CC875243 (h)                                          ATTCATGGTTCTGGCCCCACCTCT  TCTCAAGCTTCTCCAGAGAGGTGG
NRM c      CC875241 (h)                                          AGGCCCAAATCCATGAACTGAAGG  AATCAGAGGGACAGCCCCTTCAGT
PBX2 d     35312 (hs), 2432012 (m), us (c)                       CATCGAACACTCGGACTATCGCAG  GATCTGGGCAAGTTTGCTGCGATA
POU5F1 d   17464418 (hs), 200117 (m), 4103379 (c)                CGTGAGGATTTTGAGGCTGCTGGG  CCCCTGCGAAAGGAGACCCAGCAG
POU5F1 c   22048457 (hs), 7305398 (m), 4103379 (c), 6624722 (p)  CTGGGTTGATCCTCGGACCTGGCT  AGGCCCTTGGAAGCTTAGCCAGGT
POU5F1 c   22048457 (hs), 7305398 (m), 4103380 (c), 6624722 (p)  AAAGCAGAGACCCTGGTGCAGGCC  TTCGCTTTCTCTTCCGGGCCTGCA
PPP1R10 c  CC875249 (h)                                          TAGGGATTCTCCTGCACATGCTAC  AAACTCAAGGGACCAGGTAGCATG
PPP1R11 c  11386174 (hs), 18390326 (m)                           CGGACGCTTACTATGAAACTTCGG  TCTCTGGCTTCCGTTTCCGAAGTT
PPP1R11 c  11386174 (hs), 18390326 (m)                           AGGATGAAGAAGAGGGCTGTGGTC  CGTACACAGTGCGTATGACCACAG
PSMB8 d    38480 (hs), 405775 (m), 3242945 (c)                   GTGATTGAGATTAACCCTTACCTG  CAGACATGGTGCCAAGCAGGTAAG
RDBP c     CC875211 (h)                                          AGCCTCTGGTATACCATGGTTTGG  GGGAAACCCAATCTATCCAAACCA
RNF5 d     5902053 (hs), 9507058 (m)                             TCAGTGTGTGTGGCCACCTGTACT  TGAAGACATGGCCAACAGTACAGG
RNF5 d     13026452 (hs), 9507058 (m)                            GCCAGAAGCCCCAGGATCCCAGAT  CGGGGTGGAGTTTTTAATCTGGGA
RXRB d     30447 (hs), 3811374 (m), us (c)                       CTGGCCCCCCTGAAGATGTGAAGC  ACCCCTAAGACTGGTGGCTTCACA
TAP2 c     16610227 (hs), 19171651 (c)                           AGGCTTCTTCACCTTCACCATGTC  CCGCAAGTTGATTCGAGACATGGT
TAP2 c     16610227 (hs), 19171651 (c)                           CATCCTGGATGAGGCTACTAGTGC  ACACTCCACATCCAGGGCACTAGT
TCF19 d    833832 (hs), us (c)                                   ACTTTGGTCAATAATGTCCGACTC  GCCTGTGACCCCTTGGGAGTCGGA
TNF d      6577092 (h)                                           TACCTCATCTACTCCCAGGTCCTC  AGCCTTGGCCTTTGAAGAGGACCT
TNF d      164244 (h)                                            AAGCCGAGGGGCACCTCCAGTGGC  TTTGCACGCCCACTCAGCCACTGG
TRIM10 d   12407412 (hs), 4731627 (m), us (c)                    AGTGTCTTGAGTGTCTAAGAAAAG  TGAATCTCCTCTCTCTCTTTTCTT
TRIM26 d   16445440 (hs), 12275879 (m), us (c)                   AATCCTGAACCACCTGAGTACCCT  GTCTCTGTCTCTCCTTAGGGTACT
TRIM26 c   5926704 (hs), 6625535 (p)                             AAGAGGAGGTGACCTGCTCCATCT  CGCAGGTAATCAAGACAGATGGAG
TRIM26 c   5926704 (hs), 6625535 (p)                             AAAGCTGCATGGTGGCGGTGGCTA  CTCTTCACAGAGTCTCTAGCCACC
TUBB c     us (h)                                                ATCCAGGAGCTGTTCAAGCGCATC  CTGTGAATTGCTCCGAGATGCGCT
UM011 c    6180135 (h)                                           TTGGTGAGGATTAGGGGTTTTCCC  CAGTTGGGAGGGAAATGGGAAAAC

a Species are abbreviated as follows: (c) cow, (h) horse, (hs) human, (m) mouse, (p) pig, (r) rat. Unpublished sequence is abbreviated as us.
b Overgo primers for these genes were used in the initial screen of the CHORI 241 BAC library.

Overgo primers hybridized using the 60 °C protocol are in bold font. The other overgo primers listed were used to identify the specific gene content of BAC clones isolated
in the initial screen.

Table 2. BAC end sequence-specific PCR and overgo primer pairs used for BAC overlap confirmation

Accession no.  Primer set      Type  Forward primer            Reverse primer
CC875238       163 J11 T7      P     GTCACATTTGAAAGCACGAAA     CCGACCCCATGTCATAAAA
CC875237       163 J11 SP6     P     CAATGGAAAAGCTCCCACTC      CACTGGCGGGATGGATTAT
CC875239       359 L18 T7      P     CAGAGTTCCAAAGAGCCAGG      TAGCACAGACGCATCGCAG
CC875220       147 K21 T7      P     GAACCAATCAGAAATGTTGGAG    AGTTCTCGCTCCTCCTCTATCT
CC875240       147 K21 SP6 b   P     AGCCTCAGTTCTGGGAAAAG      CAGGGTATTGCAGAGAGCAG
CC875224       372 F10 T7      P     TCAAGAGGCTTACAATCTGAGAAA  AGTCCCAAAACTTGCATTCC
CC875222       372 F10 SP6     P     ATCTAGCCACGGGCCACTT       AACCCAGGTGTCCAAGTCAG
CC875228       431 P04 T7      P     ACCGGCCTGGGCTCTCAT        GCGCTACCAGGATCTCTCACTA
CC875250       431 P04 SP6     P     TGGGAGTGGAAGTGTATTTCA     GTCACTAGACCACAAACCCCT
CC875223       497 G11 T7      P     AGAAAGGCACCTGTGGTCAG      GAATTGAGGCAGGAATTTGG
CC875223       497 G11 T7      O     CACTTGCAGAAATGGCTTGGCACT  TTTCTTGGCAGCCTCAAGTGCCAA
CC875221       497 G11 SP6     P     TGCTGGCTCTGAAGAAACAA      GCACTGATGGCTCCTGACTT
CC875221       497 G11 SP6     O     AGCCTCTGGTATACCATGGTTTGG  GGGAAACCCAATCTATCCAAACCA
CC875244       394 D17 T7      P     CGATACTACGGAGGACAGAAT     GCATTTCCTAATTGAGTTGG
CC875243       394 D17 SP6     P     TGAAGTGGATGGTGGAGTCA      GGGTGATGATGCCTTTCTCA
CC875246       024 A13 T7 b    P     AGTGAATGAGAAGACAAGCCA     GGCAGGTTTTTGTGTGGA
CC875246       024 A13 T7      O     TTCCCATGGATGTGCTTGGGGATG  ACTGGAGCTCCAACCCCATCCCCA
CC875242       464 F03 T7      P     GCGATTGTTGGTCTGCTGTA      TCCATAAGTGTGGCAAGCTG
CC875247       464 F03 SP6     P     TACTGCCCACTCGGACTTCT      TATCTGCGCCTCAAGGAGAT
CC875230       475 L06 T7      P     ACAGCACAGGATACAGCACATC    TATATTAGGTTGGGGGTGGTCA
CC875227       475 L06 SP6     P     CATGAGTGTCCGATTAGGCTTAC   CATTCTTACCAGAGGAGGATGG
CC875231       499 L11 T7      P     ACAGTAAAATAGGGTGGAGGC     TGGCCTTGATTCCTCTCTGTA
CC875229       499 L11 SP6     P     CCTGACTTCTGCTGTTGACTC     GGAACCTCAATCTTGTCTGGT
CC875236       355 H03 T7 c    P     ACCTCATTCGTGAAACTCTCC     CACCTCTGGGAACAGCTTATT
CC875236       355 H03 T7      O     TTTATCCCCTGAACCCACTGTGTC  GTGTGGGGTGAGATTTGACACAGT
CC875235       355 H03 SP6     P     TGTTTGCTAAATGAATGTGGA     CTCTCTGATACTGTGAAAACTGG
CC875235       355 H03 SP6     O     CTTCCTCTGTCTCATCCACAATTC  GGGAGGGAATTTGGAAGAATTGTG
CC875234       102 G05 T7      P     CTGCCACAACCCAAATGAG       AGCTTCCGTGCTCCTCTAGT
CC875251       102 G05 SP6     P     CTAGGTGAGTAGCGTATGCGG     TGTCCCAAGACCAAAAGTGTA
CC875233       382 H22 T7      P     GAGACTAGAATGAGGATAAAGCTG  AGCTAAACCATGTTCTTCTCC
CC875232       382 H22 SP6     P     TCACTCTATCCCCACCATCTG     TTGTGCAAAGCCAAGATCC
CC875252       528 E24 T7      P     CAAAACAGACCCAAGCCATT      CAGTGGAAAGGGAGAGGGTA
CC875226       528 E24 SP6 d   P     ATCCCCAAGGACCAATCC        GCACAGAGACAAGATGACCCTA

The type of primer is abbreviated as (P) for PCR primers and (O) for overgo primers.
b Indicates the annealing temperature and conditions are TD 58-57 °C, 56 °C for 30 cycles.
c Indicates the annealing temperature is 50 °C for 30 cycles.
d Indicates the annealing temperature and conditions are TD 60-51 °C, 50 °C for 30 cycles.


Results 

Identification of BAC clones containing horse MHC region 
genes 

The horse CHORI 241 BAC library has an 11.8-fold
genomic representation and contains over 190,000 recombinant
clones, with an average insert size of 170 kb per clone
(http://bacpac.chori.org/equine241.htm). Initial screening of 
the library for 24 MHC genes yielded 504 positive BAC clones. 
The gene content of the isolated BAC clones was determined by 
hybridization of individual overgo probes to secondary filters 
containing the 504 positive clones. Two hundred thirty-eight 
clones were assembled into two contigs. Seventy-four of the 238 
BAC clones were identified for the class II region as follows: 
seven with RXRB, BRD2, PSMB8, and DMB, one with BRD2, 
PSMB8, DMB, and DOB, and 67 with DOB, DRB, DQB, 
DQA, and DRA. Fifty-four BAC clones were determined to 
contain the following class III genes: six with PBX2, 15 with 
RNF5, 24 with HSPA1A and HSPA1B, and nine with BAT3 
and TNF. One hundred ten BAC clones were identified for the 
class I region as follows: 79 with class I sequence, 12 with
POU5F1 and TCF19, three with FLOT1, nine with GNL1, one
with TRIM26 and TRIM10, and six with MOG.
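
The library statistics and clone counts quoted in this section can be cross-checked with simple arithmetic. The sketch below uses only figures from the text. Because the library holds "over 190,000" clones, the implied genome size is a rough lower-bound estimate, and the variable names are illustrative.

```python
# Library statistics quoted in the text.
clones = 190_000     # "over 190,000" recombinant clones
insert_kb = 170      # average insert size in kb
coverage = 11.8      # fold genomic representation

# coverage = clones * insert size / genome size, so:
implied_genome_gb = clones * insert_kb / coverage / 1e6
print(round(implied_genome_gb, 2))  # about 2.74 Gb

# BAC clones assigned to each MHC region.
region_counts = {"class II": 74, "class III": 54, "class I": 110}
print(sum(region_counts.values()))  # 238 clones in the two contigs

# Per-gene breakdown reported for the class III and class I regions.
class_iii = [6, 15, 24, 9]
class_i = [79, 12, 3, 9, 1, 6]
print(sum(class_iii), sum(class_i))  # 54 110
```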

Fingerprint analysis, Southern blot hybridization, and contig assembly

BamHI digestion of DNA isolated from 103 BAC clones
selected based on gene content, fingerprinting, and Southern blot
analysis resulted in the construction of seven provisional contigs.
Analysis of BAC end sequences by GenBank BLAST (Altschul et
al., 1990) determined that a majority of the end sequences contained
non-coding sequences and/or repetitive elements and
were not informative for determining gene content. However,
ten MHC genes were identified in the BAC end sequences:
TAP2, CYP21A2, RDBP, C2, LY6G6C, MSH5, APOM, NRM,
PPP1R10, and MOG. End sequence-specific PCR or overgo
hybridization, combined with previous data, resulted in the
assembly of two contigs: one with a minimum tiling path comprised
of nine BAC clones for the class II region and a second
with a minimum tiling path comprised of 14 BAC clones for the
class III and I regions. A single gap between the class II and class
III regions prevented the joining of these two contigs.

Fig. 1. Bacterial artificial chromosome (BAC) contig and gene map of the equine major histocompatibility complex (ELA).
Genes are shown in sequential order, as determined by overgo hybridization and end sequencing. MHC class III genes are shown
in green, MHC class II genes are shown in red, and MHC class I genes are shown in blue. Genes are spaced equally because
distance between genes has not been determined. The blue block indicates a gap between the MHC class II and MHC class III
regions and between class I and the centromere.
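
A minimum tiling path, as used above, is the smallest set of overlapping clones that still covers a contig from end to end. Once clone positions are known, it can be found with a greedy interval-cover procedure, sketched below. This is an illustration of the concept and not the procedure used in the study; the clone names and kb coordinates are hypothetical.

```python
def minimum_tiling_path(clones):
    """Return a smallest set of overlapping clones that covers the contig.

    `clones` maps a clone name to its (start, end) position in kb.
    At each step the clone that starts within the covered region and
    extends furthest is added to the path.
    """
    ordered = sorted(clones.items(), key=lambda item: item[1])
    contig_end = max(end for _, end in clones.values())
    covered = min(start for start, _ in clones.values())
    path = []
    i = 0
    while covered < contig_end:
        best = None
        while i < len(ordered) and ordered[i][1][0] <= covered:
            if best is None or ordered[i][1][1] > best[1][1]:
                best = ordered[i]
            i += 1
        if best is None or best[1][1] <= covered:
            raise ValueError(f"gap in coverage at {covered} kb")
        path.append(best[0])
        covered = best[1][1]
    return path


# Hypothetical clone positions in kb, not data from this study.
example_contig = {
    "clone_A": (0, 170),
    "clone_B": (60, 230),
    "clone_C": (150, 320),
    "clone_D": (200, 360),
    "clone_E": (300, 480),
    "clone_F": (340, 500),
}
print(minimum_tiling_path(example_contig))
```

A gap such as the one between the class II and class III regions would make the function raise an error, which is why the two contigs are tiled separately.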


Characterization of contigs of the horse MHC region

Restriction fragment length analysis of BAC clones within 
the class II contig gave an estimate of approximately 800 kb of 
DNA in the class II contig. The order of genes in this region 
from the telomeric end is presented in Fig. 1. Southern blot 
analysis indicated the presence of at least one DRA locus, two 
DQA and DQB loci, and three DRB loci. However, the exact 
locations of these loci could not be precisely determined in this 
study. 

The class III region was estimated to contain approximately 
430 kb of DNA as deduced by restriction fragment analysis of 
the four BAC clones spanning the minimum tiling path of the 
class III portion of the second contig (Fig. 1). End sequence 
analysis confirmed the presence of the C2, LY6G6C, APOM, 
RDBP, MSH5, and CYP21A2 genes in this region. 

BAC clones encompassing the MHC class I region contained
about 1.2 Mb of DNA, extending from MICA to
GABBR1. Fifteen anchor genes were identified in this region, 
three of which were confirmed by end sequences (NRM, 
PPP1R10, and MOG). The specific gene order for the class I 
region is presented in Fig. 1. Three regions of the second contig 
contained class I sequences. The regions located near MICA 
and GNL1 each appear to contain multiple class I genes. A 
third region, located near TRIM26, appears to contain a single 
class I gene. Characterization of the class I genes contained 
within these regions has been initiated to determine how the 
distribution of ELA class I genes relates to the distribution of 
class I sequences in other species. 



Fig. 2. (A) Dual-color metaphase FISH of fluorescently labeled DNA 
from BAC clones showing chromosomal orientation of the horse MHC on 
ECA20q21. Class I MHC BAC 528 E24 shows red signal, and the class II 
MHC BAC 147 K21 a green signal. (B) Dual color FISH to equine interphase 
nuclei demonstrating that the equine MHC class III region (BAC 431 P04 
labeled green) is located between the class I (BAC 528 E24 labeled red) and II 
(BAC 147 K21 labeled red) regions. 


FISH 

FISH was used to anchor each BAC contig on ECA20 and to
identify and orient the different MHC regions relative to the
centromere. FISH was performed using 14 BAC clones (three
class I, nine class II, and two class III) strategically selected
from different regions of the two BAC contigs. All BAC clones
mapped to ECA20q21, confirming the previous localization of
the horse MHC by in situ hybridization (Ansari et al., 1988;
Makinen et al., 1989). Dual color FISH using representative
clones from the class I and class II regions showed the class I
region to be proximal to the class II region (Fig. 2A). To confirm
the location of the class III region, dual color interphase
FISH was performed. The results indicated that the class III
region is located between the class I and class II regions
(Fig. 2B) as reported for other species.

Discussion 

The two BAC contigs described herein provide the first 
comprehensive physical map of the horse MHC. The class II 
region contig spans approximately 800 kb, while the class III - 
class I contig spans approximately 1.6 Mb. The gap between the 
equine MHC class II and class III regions is estimated to be 
about 170 kb based on the distance between BTNL2 (distal 
marker of class II region) and NOTCH4 (proximal marker of 
class III region) in the human MHC. Assuming this estimate to 
be correct, the overall size of the equine MHC is estimated at
2.6 Mb of DNA, a figure smaller than the 3.6 Mb estimated for
humans (The MHC Sequencing Consortium, 1999), but slightly
larger than for the pig MHC (Chardon et al., 1999).
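
The size estimates in this paragraph follow from adding the region sizes reported in the Results to the assumed gap. A minimal sketch of that arithmetic, with illustrative variable names:

```python
# Region sizes reported in the Results, in kb.
class_ii_kb = 800
class_iii_kb = 430
class_i_kb = 1200
# Gap inferred from the human BTNL2-NOTCH4 distance, not measured in the horse.
gap_kb = 170

second_contig_mb = (class_iii_kb + class_i_kb) / 1000
total_mb = (class_ii_kb + gap_kb + class_iii_kb + class_i_kb) / 1000

print(second_contig_mb)  # 1.63, reported as approximately 1.6 Mb
print(total_mb)          # 2.6 Mb for the whole equine MHC
```

The total depends directly on the assumed 170-kb gap, so it would shift by the same amount if the true horse gap differs from the human distance.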

Studies of domesticated species including cow (Lewin et al., 
1999) and pig (Chardon et al., 1999) have reported multiple loci 
for many of the D region genes including DQA, DQB, and DRB, 
and the horse appears to share this phenomenon. Southern blot 
analysis of MHC class II BAC clones and independent assignment
of class II genes to non-overlapping BAC clones by filter
hybridization confirmed the existence of at least two ELA-DQB
loci, identified earlier by Horin and Matiasovic (2002). The
same techniques provided evidence for the presence of at least
one locus for ELA-DRA and for ELA-DQA, supporting previously
published results (Albright-Fraser et al., 1996; Fraser
and Bailey, 1996, 1998). Additionally, at least three ELA-DRB 
loci were identified that had been previously demonstrated in 
other domesticated equids (Fraser and Bailey, 1996). However, 
recent studies found that Andalusian stallions and Przewalski’s 
horses appear to have at most two DRB genes (Hedrick et al., 
1999). It is not yet known if the number of DRB loci varies 
between equine MHC haplotypes, as has been found in other 
species (Gongora et al., 1997; Ellis and Ballingall, 1999). Overall, 
our results corroborate and extend the findings of previously 
published characterizations of equine MHC class II genes. 

The gene order within the ELA class III region appears to be 
well conserved with class III regions of human, mouse, and pig, 
which is consistent with the suggestion that this region of the 
mammalian genome is evolutionarily more conserved than the 
other regions of the MHC (Peelman et al., 1996). This conservation
is evidence of a potential functional significance related
to the organization of these genes within the MHC region. Fingerprinting
analysis, however, indicates that the size of the class
III region in the horse is smaller relative to that of human and 
mouse (Kumnovics et al., 2003). 

Of the 15 genes localized to the equine MHC class I region, 
seven genes (POU5F1, TCF19, GNL1, TRIM26, TRIM10, 
PPP1R11, and MOG) were previously identified as highly conserved
framework genes in human and mouse (Amadou, 1999).
The framework hypothesis predicts that functionally important
genes within the class I region form a scaffold of framework
genes. Within the framework scaffold are regions referred to as
duplication blocks, where class I genes have been inserted,
duplicated, and expanded. Three major duplication blocks designated
alpha (between PPP1R11 and MOG), beta (between
BAT1 and POU5F1), and kappa (between GNL1 and
TRIM26) have been identified in a number of species. However,
the number of class I genes found within these regions
varies greatly between species (Kulski et al., 2002). In this 
study, the ELA regions shown to contain class I genes coincide 
with the beta and kappa blocks, but no clear evidence of class I 
genes in the alpha block was found. This arrangement is similar 
to that reported in pigs (Velten et al., 1999). Detailed analysis 
of the ELA class I region will be necessary to confirm the framework
organization suggested in this study, but the ELA data to
date appear to support the conservation of framework genes 
and duplication blocks across species and the idea that the 
genomic organization of the class I region has functional and 
evolutionary significance. 

The gene order and organizational features of the horse MHC 
described here are in general agreement with that described for 
the primates (Leelayuwat et al., 1993; The MHC Sequencing 
Consortium, 1999) and carnivores (Beck et al., 2001; Wagner, 
2003), although ELA appears to be reduced in size. This observation
implies that the disruptions of the MHCs seen in pigs and
ruminants occurred after divergence of the Artiodactyls in the
mammalian lineage. In swine the MHC is disrupted by a centromere,
and in ruminants, by a large chromosomal inversion
(Smith et al., 1995; Skow et al., 1996; Band et al., 1998; McShane 
et al., 2001). Interestingly, characterization of the chicken MHC 
also shows a disruption (Kaufman et al., 1999) that is not 
observed in the passerine birds (Kaufman et al., 1999; Shiina et 
al., 1999). The genomic structure of the horse, human, and 
mouse class II and III regions seems to be highly conserved 
(Amadou et al., 1999; The MHC Sequencing Consortium, 1999; 
Walter et al., 2002). This region may have existed in a primordial
ancestor in a similar organization and was not subjected to
major rearrangement during evolution in these species. 

In conclusion, comparative analysis of the equine MHC 
region has demonstrated significant conservation of gene order 
and genomic structure relative to other mammalian species, 
except the Artiodactyla. The construction of two BAC contigs 
has increased our knowledge of the gene content and organization
of the MHC region of the horse and provided the
sequence-ready templates required for detailed analysis of this 
important region of the genome. We anticipate that further 
characterization of ELA will provide valuable information on 
the functional genomics and evolution of the MHC of the horse 
and contribute to our understanding of the relationship between
the genes of the equine MHC and the equine immune
response.

Abstract. The MHC class II DQB gene of horse was isolated
and characterized. No obvious mutations causing frame shifts,
or destruction of putative protein structure and splicing machinery
were detected. Nucleotide sequence of exon 2 was consistent
with an allelic sequence of the W23 haplotype. The cytoplasmic
region of the equine DQB gene comprised two exons
and an intron. A novel fragment of the gene was identified at
the 3′ intergenic region proximal to the ELA-DQB gene by
sequence comparison between the human and horse DQB
genes. This sequence showed the highest identity to the exon 3
region of the DQB gene; however, the 5′ half of this exon was
truncated as compared with the intact exon. This gene fragment
was also identified in the same site of the HLA-DQB
gene.

Copyright©2003 S. Karger AG, Basel 


Major histocompatibility complex (MHC) class II antigens
are highly polymorphic cell-surface proteins involved in the
initiation and regulation of immune responses (Babbit et al.,
1985, 1986; Buus et al., 1986). Allelic sequence variation of the
class II genes predominantly affects the structure of the first
external domains of the α and β chains, which present the
foreign and self peptides generated by enzymatic modifications
(Vogt and Kropshofer, 1999). Class II genes have thus been
considered to be candidate genes for autoimmune diseases
(Gebe et al., 2002; Pheps et al., 2000).

From phylogenetic analysis of immune systems, class II
genes have been discovered in lower vertebrates such as
cartilaginous fish (Kasahara et al., 1992) and amphibians (Sato et
al., 1993). These observations were considered to be evidence
for the establishment of vertebrate antigen presentation
pathways long before the emergence of mammalian clades. The
driving force behind the conservation and extensive polymorphism
of the genes has been assumed to be primarily overdominant
selection (Hughes and Nei, 1988; Nei et al., 1997).

The horse MHC system, also known as Equine Lymphocyte
Antigen (ELA), has been defined serologically using pregnant
mare sera and alloantisera (Bull et al., 1983). To date thirteen
alleles of the class I A locus have been recognized
internationally (Lazary et al., 1988). Further, six serological
determinants of the class II region have been interpreted as
allelic determinants of the DQB gene (Szalai et al., 1993). In this
study the class II DQB gene of the horse is characterized to
clarify its evolutionary consequence and functional
differentiation in horses.


Materials and methods 

A horse genomic library was constructed using the λGEM-12 vector as
described previously (Frischauf et al., 1983). The library was screened with
the probes listed below using conventional laboratory protocols. HindIII, SalI and
EcoRI fragments were cloned into the pUC118 vector for sequencing. Two
oligonucleotide probes for the β2 sheet (MZ1) and cytoplasmic tail (MZ2) were
synthesized in our facility. Nucleotide sequences were as follows: MZ1: 5'-CGG


Fig. 1. Nucleotide sequence of the ELA-DQB gene. (a) The exon 2
sequence is consistent with a nucleotide sequence previously reported as an
allele of DQB (W23: L08747.1). (b) Nucleotide sequence of the interval
covering from exon 3 to exon 6. The amino acid sequence of the gene is aligned
with the codons, which are shown in bold letters.
a

gatcccagggcttgcccttcgagacagggctctcatttccttgtataaaggcaattctggaagccctcacagggagacgggtaggcctggggaagagctg 

gctgagaatcccggggtcagagcgggaggcgagtggggcggggacccgaggtcgcggccgggtttttaggtttatctcacccaactggccgcgacccaga 


■k k k k k k 


exon 2 


k k k k k k 


acgtccacccagcacaaatggtgctgcgttgggctgcggggctgcggcctgactcatgggcggcgattccccgcagaggatttcgtgatccagtttaagg 

AspPheVallleGlnPheLysA 

cccagtgctacttcaccaatgggacggagcgggtgcggctcgtgaccagactcatctataacctggaggagtacgcgcgcttcgacagcgacgtgggggt 

laGlnCysTyrPheThrAsnGlyThrGluArgValArgLeuValThrArgLeuIleTyrAsnLeuGluGluTyrAlaArgPheAspSerAspValGlyVa 

gtaccaggcggtgaccgagctggggcggccgtgcaccgagtactggaacgggcagaaggacgaactggaacgggtgcgggecggggtggaccgtgtgtgc 

ITyrGlnAlaValThrGluLeuGlyArgProCysThrGluTyrTrpAsnGlyGlnLysAspGluLeuGluArgValArgAlaGlyValAspArgValCys 

agacacaactacaagttggaggtccccaggtccttgcagcaccgaggtgagcgccggtcatccgccctccgcagggcccgccctccgcagggtccacctg 

ArgHisAsnTyrLysLeuGluValProArgSerLeuGlnHisArg 


gccgccgagtctctgcgccaggagagcttggggagcggcggtctgtg 650 


k k k k k k 


k k k k k k 


exon 3 

ccccgggtaccaaggaggagtctgcccgtgtggagacttgctgtgtggtttcacatctcactgtcttttcctcccctcagTGGAACCTACGGTGACTGTC 

alGluProThrValThrVal 


TCCCCATCCAGGACAGAGGCTCTAAACCACCACAACCTGCTGGTCTGCTCAGTGACAGATTTCTATCCAGGCCGGATCAAAGTTCGATGGTTCCGGAATG 

SerProSerArgThrGluAlaLeuAsnHisHisAsnLeuLeuValCysSerValThrAspPheTyrProGlyArglleLysValArgTrpPheArgAsnA 

ACCAGGAGGAGACAGCCGGTGTTGTGTCCACCCCCCTTATTAGGAACGGGGACTGGACCTTCCAGATCCTTGTGATGCTGGAAATGACTCCCCAGCGAGG 

spGlnGluGluThrAlaGlyValValSerThrProLeuIleArgAsnGlyAspTrpThrPheGlnlleLeuValMETLeuGluMETThrProGlnArgGl 

AGATGTCTACACCTGCCACGTGGAGCATCCCAGCCTCCAGAGCCCCATCACAGTGCAGTGGCgtaagggacaatttgtttcctttcactgtgggccccac 
yAspValTyrThrCysHisValGluHisProSerLeuGlnSerProIleThrValGlnTrpA 


aagacagagggcagagcttcctctggcccatcccatctcatctcttatccttgacttcactactgagctggaaatcaaggagactagagtgcctcttgtc 
ccataggaagggcatcagaagaatcctgatcgcattgtctctccagatactaggaggtcagttaacacaccacggccccagaacccagccttgatgactc 
tgaaggattgactattatgactggtgactggggtcttagggtctcagattatggatgttttcctgaggagcagggatccgcttcctcccctttctctcac 
ccacccactgtgtccaaggatctattggctggtccctcccccaagagtggccagaatggagacctagttcccctggaacctctacctcctgtatctcaga 


k k k k k k 


k k k k k k 


exon 4 

caggacttcatgcttcccaaaggatcactgtggcgtactgggacaaatgctgacactcaggctctgatccccagGGGCGCAGTCTGAATCTGCCCAGAGC 

rgAlaGlnSerGluSerAlaGlnSer 


AAGATGCTGAGTGGTATTGGGGGTTTCGTGCTGGGGCTGATGTTCCTCGGACTGGGCCTTGTCATCCGTCACAGGAGCAAGAAGGgtaaggcactgtggg 

LysMETLeuSerGlylleGlyGlyPheValLeuGlyLeuMETPheLeuGlyLeuGlyLeuVallleArgHisArgSerLysLysG 


gaaatggggaagatgagctgtgactgagaccctctgttcaggggtcctctgcctccagtgtaaatccttcctcctgaccctaaaaggcaaaaacetgggc 
tggtggtgggaggagccetagggggagatgctggaatctggtaacaggtggaatgtattctaggacttccttcagttcatcagacctcactggctccttt 
ccgaaagcttcctcctttaagagggtcagagcataggctttccttccttctagtgagtgtttcattcattttggaggattttagcttggggcagttaaga 
cctggaggctgatgggtaaggaggaaataactttccatttaagttgcatgtctcatttccctttggggtgagtgagggactggatgtttgtttagtgaga 


k k k k k k 


k k k k k k 


exon 5 

cctttctctgtgtaacttcctttgtagGACCTCGTGGGACTCCACCAGCAGgtactatttctgcccggattcattttggggtgcggggacaggtaaaaga 

lyProArgGlyThrProProAlaG 


ggaagggctgagctgagtgtccctgggcacaatggtctcagttcatggcctattccctgctatgggggtcaaggttaggggagaaggttgcccagtttc 
gtaggaagctccgaggtttgcccccagaaccagggcataattttggtgacatctttctgtgaaacttggagccagacccacctcttgggtattagacac 
GGQaggatgcccactttgtgtcacatgttggtgactactgcctgtgtgcatttgtaagtggtggaacggtgggttatctaatttactaaaaagaattaa 
tcttcatattccccagagggataacagctgcccccccgcctcccacgcatctgcgtggagctgaaatgccatgtcctctttagctgatttcacttttac 
cagatattggggaagatgatgatgctatgccctggacctcagctttctctatctgatgctgcaggggcctcgaggggagaggagaaggtgcatttctca 


****** exon 6 


k k k k k k 


gggtgctctgtgctgatcacactctcttttctacagGGCTCCTGCACTGACTCCTGAGGATACTTTGGGATTGGTCTTCGTTCTTCTGTAATGCCTGTTT 

lyLeuLeuHis* * * 


ATCCTTGCTCAGAATCCCAACTGCCTGTCAGCCTGATGCCCTCTGAGATCAGAGTCCTACAGTGACTCTGACAAAGTCGCCAGGTCACCTCCTGTGACCC 
CCACCTTGAGTATCTTACTGCAATGGTGCTTCCTGCACTGACCCCGGAGCCTTTGCCTGTGTGCTGCCAGCTGCATCTGCTGAGACGCCAAGGGGTTTTT 
CTGTTTCCCTTCTTCCTC CATAGACTGTT CAAGAGAAACAC T TGAAGC CAT TTGC CTGAGTATAGAGATTTTATCATAATAAACATGATTATGAGTTAC C 
TGTatcctgaacctccttaaatgagcagagataggaaaccactgtagaatgaaggaacatattttggggaacctgggaccagaaggaagagtttcttctt 
gaaaggagactagaagcctcttggggtgccatataagagtgagcaaaggagatagaaaattaattcaatagtcatgtccttcctggttctttagtattga 
cgtttggtgcagtggccttaggatgtgcccgtetctcttccagtttggtgagtactgtataagtaagcatggtggaagtgtttgttgactgatatagtga 
cccctggtcactgatgtttcaaatatatcctggcaagtcacattgatcaaggtaaatttttattttttagaaagtataaccagtaataaaagtacatttt 
tggttttaaatgatagcaatccaacacaaatttatttattttttcctgttagaaagaagcctagggtcaagttgatttcagaagtatctagatgcagata 
ttcagtgagatcttcatctctctcttgtctttctgttgtgtttctgtctgtgtgcctttctctctttctctgtctctgtttgtccccatatacttctttc 
agtgtatctctctgtatctttttctctgtttcccttgttgcctttcactgcattgttgccccaccccctctctctctcattttgatctgtctgcactttt 
gtegetatttatctgcatctctttatctcatctctgtttctctgtgtatgtgtgtgcgtgtgtgtctgttatatgtgtctaccagagttttaaacatttt 
aggaaagattctgattggcccaccctgggtaacttgcgcaactctcaaaaacgtcatcctgtacaggggaTGGAGACAACCAGTGTTGTGCCCATCACTC 
TTATTAGGAACAGGAACTGGATTTTACAGATCCTTTTGTTGCTGGAAATGACTTCCTAGCGTGGTAATGTCCACATCTGCCACATGGAGCACCTCCAGgt 
tccagatacccttcacagtggagtggtgtcggggaagtttgtttcttgggaccccgcargacaaatggcagagctccctctggttctaaggtccctcctt 
atggggtgccagctcagactcatcccatcctttgtctcccatacagtcacgtgatggcatctgctgagctggaatctcagaggacagaattc 
ELA-DQB

ELA-DQB 
gene fragment 


^exon 3 boundary 

V E P T V T V 

ccccgggtaccaaggaggagtctgcccgtgtggagacttgctgtgtggtttcacatctcactgtcttttcctcccctcAGTGGAACCTACGGTGACTGTC 100 

SPSRTEALNHHNLLVCSVTDFYPGRIKVRWFRND 
TCCCCATCCAGGACAGAGGCTCTAAACCACCACAACCTGCTGGTCTGCTCAGTGACAGATTTCTATCCAGGCCGGATCAAAGTTCGATGGTTCCGGAATG 200 

ccagagttttaaacatttt.aggaaagattctgattggcccaccct.gggtaacttgcgcaactct.caaaaacgtcatcctgtac 


ELA-DQB 
gene fragment 


QEETAGVVSTPLIRNGDWTFQILVMLEMTPQRG 



ELA-DQB 
gene fragment 



EHPSLQSPITVQW 

GCCTCCAGAGCCCCATCACAGTGCAGTGGCGTaagggacaatttgtttcctttcactgtgggccccac 400 


Fig. 2. Evidence for gene duplication and subsequent deletion of DQB genes. The functional exon 3 sequence of ELA-DQB was 
aligned with a novel gene fragment identified at the intergenic region of the ELA-DQB gene. 


AAT GAC CAG GAG GAG ACA GCT GGC GTT-3', MZ2: 5'-TCG GGG
GCT TCG TGC TGG GGC TCA TCT TCC-3'. The β1 sheet probe was
generated by polymerase chain reaction using oligonucleotide primers GH28
and GH29 as described previously (Gyllensten et al., 1990; Szalai et al.,
1993). Nucleotide sequencing was done using either an ABI 373S machine
(Applied Biosystems, CA, USA) or an ALF-Express machine (Pharmacia
Biotech, Tokyo) using the recommended sequencing chemistries.
Nucleotide sequence data were compiled with Genetyx Mac Software (System
Software, Tokyo), and homology searches were done using the GenBank
BLASTN program. Nucleotide sequences reported in this manuscript were
deposited in the nucleotide databases DDBJ, EMBL, and GenBank under
accession numbers AB106862 and AB106863.


Results and discussion 

Two lambda clones, λms07 and λms08, were isolated from
the library following probing with the β1 sheet probe. The
presence of the DQB gene in λms07 was confirmed by PCR using the
primers GH28 and GH29 (Gyllensten et al., 1990) and probing
with MZ1 and MZ2. Nucleotide sequencing confirmed that
clone λms08 contained the DRB gene. In this report, the
characterization of λms07 is described.

The nucleotide sequence of the putative β1 sheet region is
shown in Fig. 1A. The 5' boundary of exon 2 starts with the
third codon of the β1 sheet domain, which is conserved in
mammals (Groenen et al., 1990; Scott et al., 1991). The 221
nucleotides of this exon are consistent with a nucleotide sequence
previously reported as an allele of the DQB gene (W23: L08747.1;
Szalai et al., 1993). The nucleotide sequence of 3,592 bp
covering the β2 sheet, connecting peptide, transmembrane,
cytoplasmic tail and 3' untranslated region is shown in Fig. 1B. As
compared with the cDNA sequence reported previously (Szalai et
al., 1994a), the genomic region comprised four exons and three
introns. There are no obvious mutations causing frame shifts
or changes in the putative protein structure or splicing
machinery.


The equine DQB cDNA sequence reported here is 8
residues longer in the cytoplasmic domain than the HLA-DQB1
cDNAs (Szalai et al., 1994b). This configuration is similar to
the H2 IAβ chain (Larhammar et al., 1983b; Malissen et al.,
1983). The difference may be explained by an alternative
splicing site in the exon 5 region. In the horse DQB gene, the splice
acceptor and donor sites of the exon appear to be intact, and the
variant branch sequence matches well with the mammalian
consensus sequence.

A novel gene fragment showing about 60% identity to the
exon 3 region of the ELA-DQB gene was identified in the
intergenic region about 800 bp downstream of the putative
poly(A) signal. The gene fragment corresponds to exon 3 with its
5' half truncated by deletion and has the same sense
configuration as the ELA-DQB gene (Fig. 2). The same gene fragment
was also identified in the HLA-DQB2 region as judged by
sequence similarity and deletion break point. This suggests that
gene duplication and subsequent deletion of DQB genes took
place before the divergence of the mammalian orders Archonta
and Ferungulata some 74 Myr ago (Kumar and Hedges, 1998), as
shown in Fig. 3.
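
An identity figure such as the "about 60%" quoted above is normally a count of matching positions over an alignment. The following is a minimal sketch of that calculation, assuming the two sequences have already been aligned and that gaps are written as "-"; the function name, the choice to exclude gap columns from the denominator, and the toy sequences are illustrative assumptions and are not taken from the alignment in Fig. 2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns do not count toward the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not the ELA-DQB sequences.
toy_exon = "ACGTAC-GTA"
toy_fragment = "ACGAACTGTT"
print(round(percent_identity(toy_exon, toy_fragment), 1))
```

Other conventions (for example, counting gap columns as mismatches) give lower values for the same alignment, so the convention should be stated alongside any reported percentage.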

Two independent ESTs were predicted to be transcribed
from the region surrounding the gene fragment identified
above. The two ESTs had an overall identity of 91%, but differed
by several deletions, insertions and substitutions. AK097297.1
was predicted to encode 261 amino acids, whilst XM_167051.2
encoded 131 amino acids (GenBank database, February 2003).
They share an identical amino terminus but diverge after the
first 60 amino acids due to a frame shift. The ESTs had a
unique structure: they were composed of three regions, a
unique amino terminal region, the DQB exon 3 related region
and a HERV-K LTR related region (Fig. 4). It has been postulated
that HERV emerged in primates 30 Myr ago (Sverdlov, 2000).
This fusion gene therefore appears to have resulted from the
integration of the HERV-K LTR into the HLA-DQB gene fragment.
Although no
Fig. 3. A model for generation of the current
organization of the DQB region. Tandem duplication
of the DQB region took place and then a large
deletion between the 3' end of the first DQB gene
and the exon 3 region of the second DQB
gene occurred sometime after early mammals
emerged.

[Figure labels: EX3; DQA; DQB; Gene fragment; scale bar, 1 kb; Horse; Human]

Fig. 4. Identification of the novel gene fragment
proximal to the DQB gene and expression of
the element as a fusion gene. The hatched box
indicates the novel gene fragment identified in the
intergenic region of the HLA-DQB2 gene. Two
ESTs were predicted to be transcribed as a fusion
gene of the gene fragment and the HERV-K LTR.

[Figure labels: exon3; exon4; exon5; exon6; pA; Novel EST; AK097297.1; XM_167051.2; The unique region; HERV LTR]

typical transactivation signals such as hormone responsive
elements (HRE), enhancers (GTGCTAAG element) and
promoters (-TATAA-) were identified in the LTR region, the ESTs
are predicted to use the poly(A) signal found in the U5 related
region of the LTR. Several candidates for the promoter of the
ESTs were identified in the -180 to -20 region. Nucleotide
sequences resembling an AP-1 site (-CTGACTCC-) were also
identified in the -780 region (see AL2789.11). Recently,
Vinogradova et al. (2001) reported that individual human HERV-K
LTRs are transcribed differently in different cells or tissues, and
that different LTRs show different transcriptional behavior in
the same cell line or tissue. It will be interesting to investigate
whether the ESTs are driven by read-through from the HLA-DQB
genes or by independent elements predicted in the region.
Abstract. Comparative biochemical and histopathological
data suggest that a deficiency in the glycogen branching enzyme
(GBE) is responsible for a fatal neonatal disease in Quarter
Horse foals that closely resembles human glycogen storage
disease type IV (GSD IV). Identification of DNA markers closely
linked to the equine GBE1 gene would assist us in determining
whether a mutation in this gene leads to the GSD IV-like
condition. FISH using BAC clones as probes assigned the equine
GBE1 gene to a marker-deficient region of ECA26q12→q13.
Four other genes, ROBO2, ROBO1, POU1F1, and HTR1F,
that flank GBE1 within a 10-Mb segment of HSA3p12→p11,
were tightly linked to equine GBE1 when analyzed on the
Texas A&M University 5000 rad equine radiation hybrid panel,
while the GLB1, MITF, RYBP, and PROS1 genes that flank
this 10-Mb interval were not linked with markers in the GBE1
group. A polymorphic microsatellite (GBEms1) in a GBE1
BAC clone was then identified and genetically mapped to
ECA26 on the Animal Health Trust full-sibling equine
reference family. All Quarter Horse foals affected with GSD IV were
homozygous for an allele of GBEms1, as well as an allele of the
most closely linked microsatellite marker, while a control horse
population showed significant allelic variation with these
markers. These data provide strong molecular genetic support
for the candidacy of the GBE1 locus in equine GSD IV.

Copyright©2003 S. Karger AG, Basel 


One of the least common glycogenoses in humans is an 
inherited deficiency of the glycogen branching enzyme (GBE) 
known as glycogen storage disease IV (GSD IV) (Andersen, 
1956). GSD IV in humans causes a variety of clinical
presentations affecting nervous system, skeletal muscle, cardiac muscle,
and/or liver tissues (DiMauro and Lamperti, 2001). We are
investigating the basis of a recently described fatal neonatal
disorder in the American Quarter Horse that closely resembles
GSD IV (Render et al., 1999; Valberg et al., 2001). Common
elements of GSD IV in humans, cats and horses are the accu¬ 
mulation of unbranched polysaccharide in tissue sections vi¬ 
sualized by periodic acid Schiff (PAS) staining, and a profound 
decrease in GBE1 activity as measured in an indirect enzymat¬ 
ic assay (Fyfe et al., 1992; DiMauro and Lamperti, 2001; Val¬ 
berg et al., 2001). No GBE1 protein is detectable in liver 
extracts from affected foals using Western blotting procedures 
(Valberg et al., 2001). GSD IV has an autosomal recessive pat¬ 
tern of inheritance in both horses and cats (Fyfe et al., 1992; 
Valberg et al., 2001). A number of mutations in the human 
GBE1 gene have been identified that moderately or severely 
impact GBE1 enzyme activity and may account for some of the 
Table 1. Equine PCR primers and DNA markers used in this study

Locus | Forward primer | Reverse primer | PCR product size (bp) | BLAST results^a (e value / accession no. / locus) | Accession no. | Application
GBEexon2 | CCAACATGATGGTTCTGACC | GAGTGCGTGAGTAATGAGTC | 232 | 6e-80 / NM_000158.1 / human GBE1 mRNA | AY301014 | cDNA sequencing; RH mapping
GBE GP1 | GGGAATTTCTTCCCATGAA | CCAAGGCCTTTGATTCTTGG | 124 | 2e-53 / NM_000158.1 / human GBE1 mRNA | AY301015 | BAC isolation
GBE GP2 | CGCTTATGCAGAGAGCCATG | GAGTGCGTGAGTAATGAGTC | 177 | 6e-64 / NM_000158.1 / human GBE1 mRNA | AY301016 | BAC isolation
GBE 4es | CGCATTCGAGAATTCGCAC | GAAATCGCGTTATTACCGAC | 109 | - | - | BAC isolation
GBEms1 | GGAAAGACATCTTGCACCAC | TGCACCCATAACGCAAAG | 230 | - | AY301017 | Exon sequencing; Linkage mapping; Allele frequency
RYBP | GCACATGGGAATTGTGAAAA | CATGGCAAACCAGAAATCATC | 164 | 3e-37 / NM_012234.3 / human RYBP mRNA | AY301018 | Exon sequencing; RH mapping
ROBO1 | TCCAAGRCACAGCTGGARG | CAGTTTCCTCTAATTCTT | 231 | 1e-71 / NM_133631.1 / human ROBO1 mRNA | AY301019 | Exon sequencing; RH mapping
ROBO2 | AACCASTTACAAYAGTTCCAG | GCAGATCGACAGCCAATTC | 107 | 1e-13 / XM_031246.7 / human ROBO2 mRNA | AY301020 | Exon sequencing; RH mapping
POU1F1 | CTGGAGAGACACTTTGGAGAA | TCACYCGTTTTTCTCTCTGYC | 121 | 1e-48 / AF035585 / canine POU1F1 mRNA | AY301021 | Exon sequencing; RH mapping
HTR1F | GTGATGCCCTTCAGCATTGTG | TGCAATYCTACTTGCTTGTCT | 433 | e-150 / NM_000866.1 / human HTR1F mRNA | AY301022 | Exon sequencing; RH mapping
PROS1 | TCGGATAVAGGCCHTAAGTCT | CTGGAAGGCCACCCAGGTAWG | 208 | 5e-18 / NM_174438.1 / bovine PROS1 mRNA | AY301023 | Exon sequencing; RH mapping
GLB1 | CGAATGTGAACATGTGAG | AGTACTTGACTGTGAGCCC | 83 | - | AF130765 | RH mapping
MITF | AACAACCTCGGAACCGGGACC | AGCAACAAATGCCGGTTGGC | 292 | - | AF401626 | RH mapping
UMNe153 | GTGCTGGAGTGAGCTGACC | ATCCAAATCGGAGACCATATG | 135 | - | AF536265 | RH mapping; Allele frequency
UMNe66 | GAATCCCATCTTTCCTTTCAG | ACGTGGAGAATTATCCTGCG | 124 | - | AF191699 | RH mapping; Linkage mapping; Allele frequency
UMNe168 | CACCAAACCCCACTGAATTC | CACTACCCTTCCCCTACGTTCC | 149 | - | AY391308 | Allele frequency
LEX073 | CCCTAGAGCCATCTCTTTACA | CAGATCCAGACTCAGGACAG | 250 | - | AF213359 | Allele frequency
COR092 | GGCAAGAGCCAGGTATTTTC | ACTGCTTGGACGAAACTGAG | 190 | - | AF154945 | Allele frequency

a BLAST results with the highest scores and locus matches for analyses performed 04/24/03
are provided only for gene-based DNA markers with equine
sequence information obtained in this report.







heterogeneity of clinical signs (Bao et al., 1996; Moses and
Parvari, 2002).

The human GBE1 gene is very large (approximately 250
kb), with 16 exons, many introns longer than 20 kb, a cDNA of
approximately 3 kb, and an encoded protein of 702 amino
acids (Thon et al., 1993; Moses and Parvari, 2002). We
initiated a search for DNA markers closely linked to the equine
GBE1 gene as a first step in defining the molecular genetic basis
of the GSD IV-like disorder in Quarter Horses. The
equine-human comparative genome maps available for this purpose
are based on Zoo-FISH (Raudsepp et al., 1996), mapping of
equine genes on a somatic cell hybrid panel (Caetano et al.,
1999), FISH of BAC markers containing equine genes
(Milenkovic et al., 2002), and a first generation comprehensive
radiation hybrid and comparative map (Chowdhary et al., 2003).
Based on these maps, GBE1 and other genes on HSA3p12 were
suspected to lie on either ECA16 or ECA19. In this report we
describe the unexpected mapping of the equine GBE1 gene, as well
as four other genes from a 10-Mb segment of HSA3p12→p11,
to ECA26q12→q13. A microsatellite from the GBE1 BAC and
another microsatellite marker from the region were then
used to verify the candidacy of the GBE1 gene in causing the
GSD IV-like disorder in American Quarter Horses.


Materials and methods 

Equine GBE1 cDNA sequence 

Partial cDNA sequence from the equine GBE1 gene was obtained by 
RT-PCR. In brief, mRNA was isolated from skeletal muscle tissue of a con¬ 
trol Quarter Horse using the Invitrogen Micro-FastTrack 2.0 kit. cDNA was 
prepared using the Invitrogen Superscript II RT kit with random hexamers 
as primer. GBE1 primers listed in Table 1 were used to PCR amplify seg¬ 
ments of the horse GBE1 cDNA (GBEexon2, GP1 and GP2). PCR products 
were resolved on 1 % agarose gels, purified with Qiagen kits, and sequenced 
on an Applied Biosystems 3100 automated DNA sequencer. All DNA 
sequences were manually edited with Sequencher (Gene Codes Corp) and 
compared with GenBank entries by BLAST searches (blastn and blastx). 

Equine gene sequences 

PCR primers derived from predicted exons of the human, mouse and rat
sequences were used to generate equine sequences for the RYBP, PROS1,
ROBO1, ROBO2, POU1F1, and HTR1F genes from a genomic DNA
template (Table 1). Following agarose gel electrophoresis, PCR products were
purified, sequenced, edited and compared to GenBank entries as described
above.
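
Several of the primers in Table 1 contain IUPAC ambiguity codes (for example R, Y, S, W, V and H), presumably to accommodate differences among the human, mouse and rat sequences from which they were derived. A minimal sketch of how such a degenerate primer expands into the set of concrete oligonucleotides it represents is given below; the function name is an illustrative assumption, and the code table is the standard IUPAC nucleotide nomenclature.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# ROBO1 forward primer as listed in Table 1 (two R positions).
robo1_forward = "TCCAAGRCACAGCTGGARG"
for variant in expand_degenerate(robo1_forward):
    print(variant)
```

The number of variants is the product of the degeneracies at each position, so a primer with two R positions represents four sequences, and one with a V and an H position represents nine.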

GBE1 BAC isolation

The initial equine cDNA sequence was used to design the horse-specific
GBE1 PCR primers GP1 and GP2 (Table 1), which were used to screen the
INRA Horse BAC library (Godard et al., 1998; Milenkovic et al., 2002).
This allowed the isolation of BACs GP1 and GP2. Sequence from
BAC GP1 subsequently generated the primer pair 4es used to obtain BAC
4es.

GBE1 microsatellite identification 

A Southern blot of EcoRI-digested DNA from BACs GP1, GP2 and 4es
was screened for potential microsatellites with a [32P] 5' end-labeled oligo
[dCA]15 probe. The blot indicated BAC 4es contained potential CA:GT
repeat sequences in DNA fragments less than 5,000 bp in length. The BAC
4es EcoRI DNA fragments were ligated into the pBluescript vector and
plasmid subclones containing potential CA repeats were then identified by
screening colony lifts with the [32P] 5' end-labeled oligo [dCA]15 probe.
Plasmid DNA was isolated from positive colonies and the inserts were
sequenced. In this manner, GBEms1, a microsatellite within an intron of the
GBE1 gene, was identified.

Microsatellite polymorphism and genetic linkage mapping 

Twelve stallions of the Equine Genome Mapping Workshop
International Reference Family (Guerin et al., 1999), three horses (1 stallion, 2 mares)
from the parental generation of the Animal Health Trust full-sibling
reference family (Swinburne et al., 2000), nine Quarter Horse foals diagnosed
with GSD IV on the basis of characteristic histopathology and minimal GBE
activity (Valberg et al., 2001), and a control Quarter Horse population
consisting of 55 unrelated individuals from a wide geographic distribution were
used for determination of microsatellite polymorphism. PCR reactions were
performed in 15-µl volumes consisting of the following reagents: ~25 ng
DNA, 1× PCR buffer (Qiagen), 1.5 mM MgCl2, 25 µM each of dCTP, dGTP,
and dTTP, 6.25 µM dATP, 0.125 µCi [α-32P]dATP, 0.45 U HotStarTaq
polymerase (Qiagen Inc., Valencia, CA), and 5 pmol of each primer. The
PCR conditions, using an MJ Research PTC 100 thermocycler, were: initial
2 min denaturation at 92 °C; 30 cycles of 92 °C for 30 s, annealing
temperature of 56 °C for 30 s, and 72 °C for 30 s; and a final 5 min extension at 72 °C.
These reaction products were electrophoresed through 7% acrylamide
denaturing gels on BioRad SequiGen GT 38 × 50 cm plate sequencing gel units, in
the presence of 1× TBE, and allele sizes were detected through autoradiography.
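
The cycling conditions above can be written down as a small data structure, which makes the programmed run time easy to check. The sketch below is illustrative only: the class and function names are assumptions, and the total counts programmed hold times only, ignoring the time the thermocycler spends ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Values taken from the PCR conditions described in the text.
INITIAL = Step("initial denaturation", 92, 120)
CYCLE = (
    Step("denaturation", 92, 30),
    Step("annealing", 56, 30),
    Step("extension", 72, 30),
)
N_CYCLES = 30
FINAL = Step("final extension", 72, 300)


def total_hold_seconds(initial: Step, cycle: tuple, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times; ramp time between steps is not included."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


minutes = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL) / 60
print(f"programmed hold time: {minutes:.0f} min")
```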

After initial demonstration of GBEms1 polymorphism, the entire Animal
Health Trust three-generation full-sibling reference family consisting of a
total of 71 members was analyzed. The chromosomal location of GBEms1
was identified by two-point linkage analysis in reference to existing markers
on the full-sibling family map using CRIMAP software (Swinburne et al.,
2000). Comparisons of the allele frequencies of equine microsatellites
GBEms1, UMNe66 (Roberts et al., 2000), UMNe153 and UMNe168
(Mickelson et al., 2003), LEX073 (Bailey et al., 2000) and COR092 (Tallmadge et
al., 1999) between the affected Quarter Horse foal and control horse
populations were performed using a chi-squared test.
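
A comparison of allele distributions between two groups by a chi-squared test amounts to a Pearson test on a groups-by-alleles table of allele counts. The following is a minimal sketch of that statistic in plain Python, not the authors' actual analysis. The affected row follows from the text (nine foals homozygous for one allele give 18 copies of that allele), whereas the control counts are hypothetical placeholders, because the allele counts of Table 3 are not reproduced in this section.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    # Rows or columns that are entirely zero carry no information.
    rows = sum(1 for total in row_totals if total > 0)
    cols = sum(1 for total in col_totals if total > 0)
    return statistic, (rows - 1) * (cols - 1)


# Columns are alleles 1 to 4 of a single marker.
affected = [0, 0, 18, 0]    # nine foals, all homozygous for allele 3
control = [10, 25, 60, 15]  # HYPOTHETICAL counts for 55 horses (110 alleles)

statistic, dof = chi_squared([affected, control])
print(f"chi-squared = {statistic:.2f} with {dof} degrees of freedom")
```

The statistic is then compared with the chi-squared distribution for the stated degrees of freedom to obtain a P value.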

Fluorescent in situ hybridization (FISH) 

FISH and chromosome preparation were performed as previously
described (Lear et al., 1998). Briefly, approximately 1 µg of DNA from BAC
clone GP2 containing a segment of the GBE1 gene was labeled with biotin
using the BioNick Labeling System (Invitrogen) and hybridized to horse
lymphocyte metaphase spreads. The bound probe was detected with
avidin-conjugated fluorescent antibodies (Ventana). The results were analyzed using a
Zeiss Axioplan2 fluorescence microscope and Cytovision®/Genus™
application software version 2.7 (Applied Imaging).

Radiation hybrid mapping 

Equine specific PCR primer pairs (Table 1) for gene and microsatellite 
markers were derived and PCR conditions were optimized to eliminate 
interference from hamster DNA. The 5000 rad whole genome radiation 
hybrid panel comprising 92 hybrid cell lines was typed by PCR as previously 
described (Chowdhary et al., 2002, 2003). Markers were typed in duplicate, 
separated by electrophoresis on 2.5 % agarose gels and scored manually. The 
RHMAPPER software (Slonim et al., 1997) was used to assign markers with 
unknown locations to individual chromosomes on the current equine radia¬ 
tion hybrid map at lod > 11.0 (Chowdhary et al., 2003). Following this, the 
RH2PT program within the RHMAP 3.0 software (Boehnke, 1992; Lunetta 
et al., 1995) was used to define RH groups within individual chromosomes at 
lod >7.0. 


Results 

Equine sequence determination 

Partial cDNA sequence for the equine GBE1 and partial
genomic DNA sequences from exons of the equine RYBP,
ROBO1, ROBO2, GBE1, POU1F1, and HTR1F genes were
determined. In addition, an intronic region of GBE1 that
contained microsatellite GBEms1 was derived from subcloning
BAC 4es. GBEms1 was subsequently found to lie in intron 10,
196 bases 5' of exon 11. All novel equine sequences were
submitted to GenBank and the accession numbers are provided
in Table 1.

FISH 

The localization of the equine GBE1 gene to ECA26q12→q13
by FISH with BAC GP2 is shown in Fig. 1. These results
were subsequently confirmed through FISH of additional
BACs (GP1 and 4es) containing the equine GBE1 gene that
were also isolated from the INRA horse BAC library (not
shown).

Radiation hybrid mapping 

Genes on HSA3p chosen for radiation hybrid mapping in
the horse are shown in Table 2. Statistically significant
linkage was established between GBE1 and ROBO1, ROBO2,
GBEms1, POU1F1, and HTR1F. GBE1 was linked with lower
statistical support to ECA26 microsatellite loci UMNe66 and
UMNe153. The HSA3p genes PROS1, RYBP, MITF, and
GLB1 were not linked to any of the above loci.

Genetic linkage 

Three alleles of microsatellite GBEms1 were identified in
the stallions of the International Equine Genome Mapping
Workshop Reference family. Although both mares in the
Newmarket full-sibling reference family were homozygous for
GBEms1 allele 2, the stallion was heterozygous, exhibiting
alleles 2 and 3. Genetic linkage analysis on the entire
Newmarket reference family established linkage between GBEms1 and
microsatellite UMNe66 on ECA26, with a lod score of 7.76 at
an observed recombination fraction of 0.05.
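
A lod score compares the likelihood of the data at a given recombination fraction with the likelihood under free recombination (a fraction of 0.5). The sketch below uses the textbook formula for fully informative, phase-known meioses; it is an illustration of the concept under that simplifying assumption, not the pedigree likelihood computed by CRIMAP, and the counts used in the example are hypothetical rather than the family data behind the reported value of 7.76.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point lod score for phase-known, fully informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    if theta == 0.0 and recombinants > 0:
        return float("-inf")  # any recombinant excludes complete linkage
    log_likelihood = 0.0
    if recombinants:
        log_likelihood += recombinants * math.log10(theta)
    if non_recombinants:
        log_likelihood += non_recombinants * math.log10(1.0 - theta)
    # Subtract the log10 likelihood under no linkage (theta = 0.5).
    return log_likelihood - meioses * math.log10(0.5)


# HYPOTHETICAL counts: 2 recombinants observed in 40 informative meioses.
print(round(lod_score(2, 40, 0.05), 2))
```

By convention, a lod score of 3 or more is taken as evidence of linkage, which is why a value such as 7.76 is regarded as strong support.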

Microsatellite allele distributions 

Allele distribution data for six equine microsatellites in
GBE1 deficient foal and control Quarter Horse populations are
provided in Table 3. All nine affected foals were homozygous
for allele 3 of GBEms1 as well as allele 3 of UMNe66. Although
GBEms1 allele 3 and UMNe66 allele 3 were the most common
alleles in the control population, this group of horses displayed
significant allelic heterogeneity (four and three alleles
respectively) with both these markers. Chi-squared tests
demonstrated the allele distributions for both GBEms1 and UMNe66
to be significantly different between the affected foal and
control populations (P < 0.005 and P < 0.025 for GBEms1 and
UMNe66 respectively). ECA26 microsatellite UMNe153,
linked to UMNe66 at a distance of 8 cM (Mickelson et al.,
2003), had allelic variation in both affected foal and control
populations, and allele distributions that were not significantly
different between groups. ECA19 microsatellites LEX073 and
COR092, as well as the as yet unmapped microsatellite
UMNe168, also had allelic variation in both affected foal and
control populations, and allele distributions that were not
significantly different between groups.

Discussion 

A major result obtained in this study is the description of
previously undetected synteny conservation between HSA3p12→p11
and ECA26q12→q13 that was provided through a
combination of FISH, radiation hybrid and genetic linkage mapping
of the equine GBE1 and flanking genes. This map
assignment proved very useful in allowing us to develop and
choose polymorphic markers for generation of the second major
result, which was the demonstration of genetic association
between markers in and near the GBE1 locus with the GSD IV
condition in American Quarter Horses.


A new map of ECA26, containing equine orthologs of the
HSA3p12→p11 genes ROBO2, ROBO1, GBE1, POU1F1 and
HTR1F, is provided in Fig. 2. The detection of this new
conserved syntenic segment in the horse adds information to the
equine-human comparative map, which previously indicated
only that ECA26 shares synteny with HSA21q (Raudsepp et al.,
1996; Lear et al., 1998; Godard et al., 2000; Milenkovic et al.,
2002; Chowdhary et al., 2003). Synteny of segments
corresponding to HSA3 and HSA21 is present in chicken,
marsupials and the majority of the eutherian mammals (Maccarone
et al., 1992; Chowdhary et al., 1998; Richard and Dutrillaux,
1998; Smith et al., 2000; Chowdhary and Raudsepp, 2001).
Even in the highly rearranged dog, mouse and elephant
karyotypes the two segments preserve tandem synteny (Yang et al.,
1999; Smith et al., 2000; Yang et al., 2003). As far as is known,
segments homologous to HSA3 and HSA21 are on separate
chromosomes only in Old World monkeys and great apes. This
is attributed to a fission event that took place after divergence


Table 2. Two point linkage analyses between GBE1 and other DNA markers typed on the equine radiation hybrid panel 

Loci a     Result                  Equine location (reference)              Human cytogenetic   Human sequence 
                                                                            location            location (Mb) 
GLB1       Not linked to GBE1      ECA16 (Caetano et al., 1999)             HSA3p21             32.8 
MITF       Not linked to GBE1      ECA16q14-q16 (Terry et al., 2002)        HSA3p14-p12         69.8 
                                   ECA19q15 (Milenkovic et al., 2002) 
                                   ECA16 (Blechynden et al., 2002) 
RYBP       Not linked to GBE1      Unknown                                  HSA3p13             72.2 
ROBO2      Linked to GBE1          ECA26 (this report)                      HSA3p12             77.4 
ROBO1      Linked to GBE1          ECA26 (this report)                      HSA3p12             78.4 
GBEms1     Linked to GBE1          ECA26q12-13 (this report)                N/A                 81.3 
POU1F1     Linked to GBE1          ECA26 (this report)                      HSA3p11             87.1 
HTR1F      Linked to GBE1          ECA26 (this report)                      HSA3p12             87.9 
PROS1      Not linked to GBE1      ECA19q25 (Milenkovic et al., 2002)       HSA3p11             94.9 
UMNe66     Suggestive of linkage   ECA26 (Roberts et al., 1999)             N/A 
UMNe153    Suggestive of linkage   ECA26 (Mickelson et al., 2003)           N/A 

a Analysis of these loci on the horse-hamster radiation hybrid cell panel using PCR primers reported in Table 1 
was as described in Materials and methods. LOD scores greater than 11 are considered proof for linkage to 
GBE1, with LOD scores greater than 7 indicative of suggestive linkage to GBE1. 

N/A = not available. 


Table 3. Allele distribution of microsatellite markers observed in populations of control horses (C) and GSD IV affected foals (A) 

Microsatellite     GBEms1   UMNe66   UMNe153   LEX073   COR092   UMNe168 
                   C-A      C-A      C-A       C-A      C-A      C-A 
Number of horses   50-9     49-9     50-9      48-9     49-9     55-9 
Allele 1           14-0     5-0      5-0       2-0      11-4     8-0 
Allele 2           27-0     21-0     17-3      9-2      31-8     2-1 
Allele 3           58-18    72-18    40-9      10-0     28-1     18-3 
Allele 4           1-0               3-1       23-4     24-5     7-1 
Allele 5                             4-0       46-11    4-0      2-0 
Allele 6                             8-5       6-1               26-5 
Allele 7                             3-0                         33-5 
Allele 8                             4-0                         9-1 
Allele 9                             16-0                        5-2 

The indicated microsatellite markers were analyzed on the number of control horses (C) and GSD IV affected 
foals (A). That different numbers of control horses are reported for different microsatellites indicates PCR failure 
for several control horse DNA samples. Genotypes for each horse were obtained by gel electrophoresis and 
autoradiography as described in Materials and methods, and the number of alleles of each size in the control and 
affected foal groups are reported. 
of Old and New World monkeys from a common ancestor 
(Muller et al., 2000). Given these facts, it is intriguing that 
this synteny had not previously been detected in the horse, 
even though it was found in Hartmann's mountain zebra 
(Richard et al., 2001). Findings of the present study clearly show 
that the horse is not an exception, and that genes from HSA21 and 
HSA3 are syntenic on ECA26. 

RYBP and PROS1, which closely flank the approximately 
10-Mb interval on HSA3p comprising the GBE1-linked equine 
genes, are not contained in the new ECA26 linkage group. 
RYBP (HSA3p14.1) has been mapped to ECA16 along with 



Fig. 1. FISH localization of equine GBE1 to 
ECA26q12→q13. Isolation of BAC GBE GP2, 
labeling with biotin-dUTP, and detection were as 
described in Materials and methods. 


most other HSA3p genes reported to date (Caetano et al., 1999; 
Blechynden et al., 2002; Milenkovic et al., 2002; Chowdhary et 
al., 2003). PROS1 (HSA3p11) on the other hand has been 
assigned by FISH to ECA19 along with several genes from 
HSA3q26.3→3q28 (Milenkovic et al., 2002; Chowdhary et al., 
2003). All other HSA3q genes thus far mapped are located on 
ECA16 (Caetano et al., 1999; Blechynden et al., 2002; 
Milenkovic et al., 2002; Terry et al., 2002; Chowdhary et al., 2003). 
That MITF (HSA3p14.1) has been assigned to both ECA16 and 
ECA19 by different investigators (Milenkovic et al., 2002; 
Terry et al., 2002) suggests that additional data are required to 
conclusively determine all the synteny relationships that exist 
between HSA3 and ECA16, 19, and 26. All reports to date do 
however indicate considerable rearrangement in the order of 
mapped genes between the conserved segments on HSA3 and 
ECA16 (Milenkovic et al., 2002; Chowdhary et al., 2003). 

At this writing thirteen clinical cases of GSD IV have been 
identified in American Quarter Horses and American Paint 
horses. However, the large number of half siblings (>2,000) 
born to the sires and dams of affected foals, and the similarity 
of clinical signs of GSD IV to other neonatal diseases, suggest 
that GSD IV may be a fairly common but poorly recognized 
fatal disease in Quarter Horse-related breeds. Visual analysis of 
the pedigree containing foals affected with the GBE1 deficiency 
condition indicates that all can be traced back within nine 
generations to a common ancestor (Valberg et al., 2001). The 
pedigrees as well as GBE1 enzyme activity data in dams and some 
half-sibs of affected foals also suggest an autosomal recessive 
pattern of inheritance. Thus, affected foals would be expected 
to be homozygous for alleles of DNA markers that are 
sufficiently close to the causative gene to have been derived 
identically by descent from the founder. All GSD IV affected foals 
were indeed homozygous for an allele of GBEms1, as well as an 
allele of the approximately 5 cM distant marker UMNe66 
(Table 3). Although the control horse population also contained 
the same GBEms1 and UMNe66 alleles that were present in 
the affected foals, the allele frequencies of each marker were 
significantly different between the two groups of horses. 
Further, an ECA26 microsatellite marker 8 cM distal to UMNe66, 
as well as markers on ECA19, did not demonstrate significantly 
different allele frequencies when the control and affected foal 
groups were compared, indicating that the results for GBEms1 
and UMNe66 did not reflect population bias. 

Fig. 2. Cytogenetic, genetic linkage, and comparative maps of ECA26. Left: Loci on ECA26 mapped by FISH. Center: Loci and 
genetic distance in cM determined on the Animal Health Trust reference family. Right: Human synteny map and distances in Mb 
on HSA3 and HSA21 were taken from the Human Genome Project. Equine mapping data used in this figure were adapted from 
Caetano et al. (1999), Swinburne et al. (2000), Milenkovic et al. (2002) and Chowdhary et al. (2003). 

The data presented here clearly support an association 
between the region of ECA26 containing the GBE1 gene and 
the GSD IV trait. Although full-length human and murine 
GBE1 cDNA sequences are known, other mammalian ESTs for 
this gene are uncommon, and our attempts to obtain the 
full-length equine GBE1 sequence using RT-PCR and cDNA 
libraries have not yet been successful. Further derivation of 
cDNA sequence and examination of the structure of the GBE1 
gene in affected and control horses is now warranted to identify 
the molecular genetic basis for this fatal condition affecting 
American Quarter Horses. 
Abstract. Epitheliogenesis imperfecta (EI) is a hereditary 
junctional mechanobullous disease that occurs in newborn 
American Saddlebred foals. The pathological signs of 
epitheliogenesis imperfecta closely match a similar disease in humans 
known as Herlitz junctional epidermolysis bullosa, which is 
caused by a mutation in one of the genes (LAMA3, LAMB3 and 
LAMC2) coding for the subunits of the laminin 5 protein 
(laminin α3, laminin β3 and laminin γ2). The LAMA3 gene has 
been assigned to equine chromosome 8 and LAMB3 and 
LAMC2 have been mapped to equine chromosome 5. Linkage 
disequilibrium between microsatellite markers that mapped to 
equine chromosome 5 and equine chromosome 8 and the EI 
disease locus was tested in American Saddlebred horses. The 
allele frequencies of microsatellite alleles at 11 loci were 
determined for both epitheliogenesis imperfecta affected and 
unaffected populations of American Saddlebred horses by 
genotyping and direct counting of alleles. These were used to determine 
fit to Hardy-Weinberg equilibrium for control and EI 
populations using Chi square analysis. Two microsatellite loci located 
on equine chromosome 8q, ASB14 and AHT3, were not in 
Hardy-Weinberg equilibrium in affected American Saddlebred 
horses. In comparison, all of the microsatellite markers located 
on equine chromosome 5 were in Hardy-Weinberg equilibrium 
in affected American Saddlebred horses. This suggested that 
the EI disease locus was located on equine chromosome 8q, 
where LAMA3 is also located. 

Copyright©2003 S. Karger AG, Basel 


Equine epitheliogenesis imperfecta (EI) is a hereditary 
mechanobullous neonatal disease characterized by missing 
epithelium on the skin and oral mucosa. EI in the horse is an 
autosomal recessive disease (Butz and Meyer, 1957). Lesions vary in 
size and location but usually consist of irregular patches of 
missing hair and epithelium on the legs and back exposing the 
underlying dermis (Thomson et al., 2001). The progression of 
this disease is always fatal and affected foals are generally 
euthanized soon after birth. 
The first case of EI in American Saddlebred horses reported 
to the American Saddlebred Horse Association (ASHA) 
occurred in 1975. Currently 34 verifiable cases have been 
reported to the ASHA. Construction of a partial pedigree for 
American Saddlebred horses showed a pattern of inheritance 
and a frequency of occurrence of EI that was consistent with an 
autosomal recessive inheritance pattern (Lieto, 2001). 

Electron microscopic examination indicates that these 
lesions were caused by a separation within the lamina lucida of 
the basal lamina in EI affected American Saddlebred foals 
(Lieto et al., 2002). There is a great deal of similarity between 
equine EI and the Herlitz variant of a human hereditary disease 
known as junctional epidermolysis bullosa. In humans, there 
are four types of epidermolysis bullosa (simplex, junctional, 
hemidesmosomal and dystrophic), each of which has many 
causative mutations of different types. These include amino acid 
changes, insertions/deletions and stop codons within the same 
gene (Baudoin et al., 1994; Anton-Lamprecht, 1995; Christiano 
et al., 1996). The Herlitz variant of junctional epidermolysis 
bullosa (HJEB) is usually caused by a premature stop codon in 
Table 1. Summary of χ² analyses of Hardy-Weinberg equilibrium (HWE) of microsatellites tested in American Saddlebred horses 

                                         EI unaffected population (N = 39)   Affected population (N = 10) 
Microsatellite a   Equine chromosome     χ² HWE      P HWE                   χ² HWE      P HWE 
                   assignment b 
AHT24              5p16                  3.33        0.504                   5.61        0.132 
HTG15              5q                    3.41        0.492                   5.35        0.294 
HTG20              5p                    6.82        0.234                   3.60        0.608 
LEX04              5p                    4.16        0.527                   6.75        0.240 
ASB14              8q                    5.83        0.560                   54.01       2.08 × 10⁻¹¹ c 
AHT3               8q                    11.93       0.155                   64.62       1.34 × 10⁻¹² c 
AHT25              8p17-p16              1.80        0.877                   7.11        0.130 
COR056             8q                    7.07        0.529                   11.17       0.048 
ASB38              27                    4.07        0.540                   4.107       0.534 
HMS18              30                    7.33        0.062                   6.31        0.097 
LEX25              30                    4.62        0.465                   7.00        0.220 

a (Breen et al., 1997; Binns et al., 1995; Coogle et al., 1996a; Coogle et al., 1996b; Godard et al., 1997; Irvin et 
al., 1998; Lingren et al., 1999; Ruth et al., 1999; Shiue et al., 1999; Swinburne et al., 2000b; Guerin et al., 1999). 
b Chromosome locations based on physical (Swinburne et al., 2000a) and radiation hybrid mapping 
(Chowdhary et al., 2003). 
c Indicates P < 0.01. 


one of the genes coding for the laminin 5 heterotrimer, which 
results in a lack of expression of the respective protein chain 
(Aberdam et al., 1994; Baudoin et al., 1994; Vidal et al., 1995). 
Based on the observed similarities between EI and HJEB, 
equine microsatellite markers located on the same 
chromosomes as the three genes coding for the subunits of laminin 5 
were tested for association with EI. 


Methods and materials 

Epitheliogenesis imperfecta samples 

Blood and tissue samples from 10 EI affected American Saddlebred foals 
were received by the University of Kentucky Equine Blood Typing Research 
Laboratory after diagnosis of EI by local veterinarians. Genomic DNA was 
isolated from spleen or blood using the Puregene DNA isolation kit (Gentra 
Systems, Minneapolis, MN). 

Epitheliogenesis imperfecta unaffected samples 

Serum samples were chosen from a collection of American Saddlebred 
blood samples taken between 1990 and 2000 for parentage testing at the 
University of Kentucky Equine Blood Typing Research Laboratory. These 39 
American Saddlebred horse samples were selected as EI unaffected controls 
and were chosen so that no farm contributed more than one horse. Two 
hundred microliters of serum were removed from each tube and heated for 
10 min at 95 °C. The heated serum was then spun in a microcentrifuge at 
20,000 g for 15 min. The resulting supernatant was used as a template in 
subsequent PCR reactions. 

Markers and genotyping 

Data from the horse genome map and horse/human comparative 
mapping were initially used to assign the putative equine chromosome locations 
of LAMA3 to ECA8 and of LAMB3 and LAMC2 to ECA5 (Raudsepp et al., 
1996; Swinburne et al., 2000a). Subsequently LAMB3 and LAMC2 were 
mapped to ECA5p15 and ECA5p17→p16 respectively by Mariat et al. 
(2001). The LAMA3 gene was recently mapped to ECA8q14→q15 by 
Milenkovic et al. (2002). Eight microsatellite markers were selected which 
mapped to ECA5 and ECA8 as well as three control markers on ECA27 and 
ECA30. Dye labeled primers based on these markers were produced by 
Perkin Elmer (South Plainfield, NJ) and used for genotyping analysis. The 
primer sets were placed into multiplexes (based on expected product size and 
fluorescent tag) of several primer sets each. These multiplexes were as 
follows: (1) LEX25, LEX04 and ASB14; (2) ASB38 and HMS18; (3) AHT24 
and HTG20. AHT25, COR056, HTG15 and AHT3 were amplified in single 
reactions. 

PCR was performed on template DNA from each of the EI affected and 
EI unaffected samples. The reaction products were then sized using an ABI 
377 and analyzed using Genscan and Genotyper software (Applied 
Biosystems, Foster City, CA). For some loci not all of the individuals could be 
genotyped. 

Statistical analysis 

The gene frequencies were determined for the EI unaffected population 
of American Saddlebred horses by genotyping and direct counting of alleles. 
Individuals with only one allele were assumed to be homozygous. These data 
were used to determine the expected genotypic proportions based on 
Hardy-Weinberg equilibrium for the EI unaffected and EI populations. Observed 
and expected genotypic frequencies were compared using χ² analysis. P 
values <0.01 were considered significant. This more stringent P value was used 
because multiple loci were being evaluated and in order to decrease the 
likelihood of falsely rejecting Hardy-Weinberg equilibrium by chance. This 
analysis compares the distribution of alleles for each microsatellite from the EI 
affected American Saddlebred horses to the expected distribution of alleles in 
a population with ten individuals based on the distribution observed in the 
EI unaffected American Saddlebred horses (n = 39). This methodology was 
previously used by Bailey et al. (1997) to demonstrate evidence for linkage 
disequilibrium between HTG8 and the gene for equine combined 
immunodeficiency in Arabian horses. Due to the large number of alleles and the small 
number of individuals, some phenotypic classes were combined so that all 
expected classes were 1.00 or greater. 
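
The expected genotype classes described above can be illustrated with a short calculation. The sketch below is an assumption-laden illustration rather than the authors' procedure: the function name is hypothetical, the allele frequencies are the ASB14 values reported for the unaffected population in Table 2, and the rule actually used to combine small classes is not given in the text, so the code only flags classes whose expected count falls below 1.00.

```python
from itertools import combinations_with_replacement


def hwe_expected_counts(allele_freqs, n_individuals):
    """Expected genotype counts under Hardy-Weinberg equilibrium.

    allele_freqs maps allele name -> frequency; homozygotes get p^2 and
    heterozygotes 2pq, each scaled by the number of individuals.
    (Hypothetical helper, for illustration only.)
    """
    expected = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p, q = allele_freqs[a], allele_freqs[b]
        proportion = p * p if a == b else 2 * p * q
        expected[(a, b)] = proportion * n_individuals
    return expected


# ASB14 allele frequencies in the unaffected population (Table 2)
asb14_freqs = {125: 0.38, 121: 0.33, 119: 0.12, 115: 0.05, 113: 0.12}

# Expected genotype counts in a population of ten individuals
expected = hwe_expected_counts(asb14_freqs, 10)

# Classes that would need to be combined to reach an expected count of 1.00
small_classes = [genotype for genotype, count in expected.items() if count < 1.0]

for genotype, count in expected.items():
    print(genotype, round(count, 3))
```

Under these frequencies the 113/113 homozygote has a very small expected count in ten individuals, which is why the observed excess of that genotype among affected foals stands out.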


Results 

In the present study we obtained genotyping data from 
eleven microsatellites for ten EI affected and 39 EI unaffected 
American Saddlebred horses. Chi square analysis of the EI 
unaffected American Saddlebred horses determined that the 
phenotypic classes for all of the microsatellites tested were in 
Hardy-Weinberg equilibrium (Table 1). Analysis of the 
distribution of alleles for EI affected American Saddlebred horses 
revealed that ASB14 and AHT3 had genotypic proportions that 
were significantly different from those that were expected 
under Hardy-Weinberg equilibrium (ASB14, χ² = 54.014 and 
P = 5.2 × 10⁻¹¹; AHT3, χ² = 59.87 and P = 4.1 × 10⁻¹²) 
(Table 1). All of the remaining microsatellites tested were in 
Hardy-Weinberg equilibrium for EI affected American Saddlebred 
horses, specifically AHT24 and HTG20 on ECA5. The EI 
unaffected American Saddlebred horse allele frequencies of 
microsatellites ASB14 and AHT3 are shown in Table 2. In the 
affected individuals there was a statistically significant excess 
of homozygotes for both loci. For AHT3, all affected American 
Saddlebred foals were homozygous for allele 141. For ASB14 
the affected foals possessed predominantly allele 113; eight of 
ten foals were homozygous for this allele while two foals had the 
113/121 genotype. The three control microsatellites located on 
ECA27 and ECA30 had a normal distribution of alleles in both 
affected and unaffected foals. 

Table 2. Allele frequencies for ASB14 and AHT3 in the EI unaffected 
American Saddlebred horse population 

ASB14 (N = 30 a)   Frequency     AHT3 (N = 34 a)   Frequency b 
125                0.38          141               0.32 
121                0.33          139               0.04 
119                0.12          137               0.16 
115                0.05          135               0.01 
113                0.12          133               0.03 
                                 131               0.00 
                                 129               0.26 
                                 127               0.16 

a For these two loci not all of the individuals could be genotyped. 
b Allele frequencies do not sum to 1.00 due to rounding. 

Discussion 

A partial pedigree constructed for American Saddlebred 
horses suggests that a single founder is responsible for the 
spread of the EI mutation within the breed (Lieto, 2001). A 
single founder responsible for EI in American Saddlebred 
horses would lead us to expect that all EI affected American 
Saddlebred horses would be homozygous for the same 
causative mutation. This genetic homogeneity would be similar to 
porcine stress syndrome in pigs and HYPP in Quarter horses 
(Fujii et al., 1991; Bowling et al., 1996). Markers located near 
the EI locus are more likely to have a conserved haplotype 
through successive generations or meioses because 
recombination events occur less frequently over short lengths of DNA. 
Haplotype preservation was observed in Arabian horses with 
the equine combined immunodeficiency locus and the 
HTG8-186 and HTG8-188 alleles (Bailey et al., 1997). 

The normal distribution of alleles for AHT24 and HTG20 
indicates that neither LAMB3 nor LAMC2 is the candidate 
disease locus. AHT24 was physically mapped to ECA5p16 and 
HTG20 was mapped to within 6 cM of AHT24 (Swinburne et 
al., 2000a, 2000b). These microsatellites provide good coverage 
of the region containing LAMB3 (ECA5p15) and LAMC2 
(ECA5p17→p16) (Mariat et al., 2001). HTG15 and LEX04 also 
have a normal distribution of alleles. HTG15 was linkage 
mapped to ECA5q13→q14 and LEX04 was placed on the p arm of 
ECA5, near AHT24, by radiation hybrid mapping (Swinburne 
et al., 2000b; Chowdhary et al., 2003). HMS69 was also typed 
since it has been located to a BAC containing LAMC2; however 
it was monoallelic in the American Saddlebreds tested and 
therefore was not informative (data not shown; Mariat et al., 
2001). 

Linkage mapping of ASB14 and AHT3 assigned these loci to 
ECA8 (Swinburne et al., 2000a). Analysis of ASB14-113 and 
AHT3-141 alleles in American Saddlebred horses indicated 
that both of these microsatellite loci are in linkage 
disequilibrium with the EI disease locus. The ASB14-113 and AHT3-141 
alleles are in phase in seven out of nine EI American 
Saddlebred foals tested. This suggests that the mutation responsible 
for EI arose on a haplotype carrying ASB14-113 and AHT3-141. 
Two foals were heterozygous for ASB14 alleles 113 and 121, 
which indicates that sometime since the original causative 
mutation arose a recombination event occurred between 
ASB14 and both AHT3 and the EI disease locus. This implies 
that AHT3 is located more closely to the EI disease locus than 
ASB14. ASB14 was placed on the q arm of ECA8, by radiation 
hybrid mapping, within ~35 cR of LAMA3 (Chowdhary et al., 
2003). No recombination was observed between AHT3 and 
ASB14 by linkage mapping (Swinburne et al., 2000b). The 
location of ASB14 and AHT3 near the recently mapped location of 
LAMA3, ECA8q14→q15, supports the hypothesis that 
LAMA3 contains the causative mutation for EI in American 
Saddlebred horses (Milenkovic et al., 2002). 

Two other microsatellites located on ECA8, AHT25 and 
COR056, have normal distributions. AHT25 has been 
physically located to ECA8p17→p16 and COR056 has been linkage 
mapped to the q arm of ECA8 (Swinburne et al., 2000a, 2000b). 
This indicates that they are distant from the EI disease locus. 

It was recently reported that LAMC2 contains the causative 
mutation responsible for EI within the Belgian horse breed 
(Spirito et al., 2002). EI in the Belgian is phenotypically 
identical to EI in the American Saddlebred. The causative mutation 
was identified as a homozygous basepair insertion in LAMC2 
leading to a premature termination codon in affected Belgian 
horses. Our results do not support LAMC2 as the EI disease 
locus in the American Saddlebred horse breed. Instead, we 
have mapped the disease locus to ECA8, the location of 
LAMA3. This suggests that the mutations that cause the EI 
phenotype in the Belgian and American Saddlebred breeds 
arose independently. Pending confirmation by sequence 
analysis of LAMA3, this would make equine EI the first reported 
hereditary disease in domestic animals with more than one 
causative mutation. 
Abstract. The PRKAG3 gene encodes a muscle-specific 
isoform of the regulatory γ subunit of AMP-activated protein 
kinase (AMPK). A major part of the coding PRKAG3 sequence 
was isolated from horse muscle cDNA using 
reverse-transcriptase (RT)-PCR analysis. Horse-specific primers were used to 
amplify genomic fragments containing 12 exons. Comparative 
sequence analysis of horse, pig, mouse, human, Fugu, and 
zebrafish was performed to establish the exon/intron 
organization of horse PRKAG3 and to study the homology among 
different isoforms of AMPK γ genes in vertebrates. The results 
showed conclusively that the three different isoforms (γ1, γ2, 
and γ3) were established already in bony fishes. Seven single 
nucleotide polymorphisms (SNPs), five causing amino acid 
substitutions, were identified in a screening across horse breeds 
with widely different phenotypes as regards muscle 
development and intended performance. The screening of a major part 
of the PRKAG3 coding sequence in a small case/control 
material of horses affected with polysaccharide storage myopathy 
did not reveal any mutation that was exclusively associated 
with this muscle storage disease. The breed comparison 
revealed several potentially interesting SNPs. One of these 
(Pro258Leu) occurs at a residue that is highly conserved among 
AMPK γ genes. In an SNP screening, the variant allele was only 
found in horse breeds that can be classified as heavy (Belgian) 
or moderately heavy (North Swedish Trotter, Fjord, and 
Swedish Warmblood) but not in light horse breeds selected for speed 
or racing performance (Standardbred, Thoroughbred, and 
Quarter horse) or in ponies (Icelandic horses and Shetland 
pony). The results will facilitate future studies of the possible 
functional significance of PRKAG3 polymorphisms in horses. 

Copyright©2003 S. Karger AG, Basel 


The AMP-activated protein kinase (AMPK) is a metabolic 
stress-sensing protein kinase that plays an important role in the 
regulation of energy homeostasis within the eukaryotic cell 
(Hardie et al., 1998; Kemp et al., 1999). The active enzyme is a 
heterotrimer composed of a catalytic α subunit, a β subunit and 
a regulatory γ subunit. In mammals, seven genes encode 
different isoforms of the different subunits (α1, α2, β1, β2, γ1, γ2, 
and γ3) and these may form 12 different heterotrimeric 
combinations. AMPK is allosterically activated by the increased 
AMP to ATP ratio that occurs during metabolic stress such as 
nutrient starvation and exercise (Hardie and Carling, 1997; 
Nielsen et al., 2003). Once activated, AMPK turns on 
ATP-producing pathways such as fatty acid oxidation and inhibits 
ATP-consuming pathways such as fatty acid synthesis, thereby 
conserving energy homeostasis (Muoio et al., 1999). AMPK is 
also activated by muscle contraction and this leads to 
translocation of glucose transporter 4 (GLUT4) to the plasma 
membrane, increased glucose uptake, and increased glycogen 
synthesis (Holmes et al., 1999). 
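
The figure of 12 heterotrimeric combinations follows from choosing one isoform for each of the three subunits (2 α × 2 β × 3 γ). A minimal enumeration, assuming that every isoform of one subunit can pair with every isoform of the others and using ASCII labels for the isoform names, might look like this:

```python
from itertools import product

# Isoform labels for each AMPK subunit (ASCII stand-ins for the Greek names)
alpha = ["alpha1", "alpha2"]
beta = ["beta1", "beta2"]
gamma = ["gamma1", "gamma2", "gamma3"]

# One catalytic alpha, one beta and one regulatory gamma per heterotrimer
heterotrimers = list(product(alpha, beta, gamma))

print(len(heterotrimers))
for trimer in heterotrimers:
    print("/".join(trimer))
```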

Recently, several mutations with phenotype effects in 
skeletal muscle or heart have been identified in the mammalian 
PRKAG2 and PRKAG3 genes encoding respectively the γ2 
and γ3 isoform of the regulatory subunit of AMPK. In 
mammals, the γ3 isoform has been found to be primarily expressed 
in white (fast-twitch, type IIb) skeletal muscle fibers, in which it 
is the predominant γ isoform, suggesting a key role for 
PRKAG3 in this tissue (Milan et al., 2000; Mahlapuu et al., 
2003). The γ2 isoform is predominantly expressed in the 
human heart but has a broader tissue distribution in 
rodents (Milan et al., 2000; Mahlapuu et al., 2003). In the pig, a 
missense mutation Arg200Gln in PRKAG3 produces the 
dominant RN phenotype characterized by markedly increased 
glycogen storage in skeletal muscle and highly significant effects 
on meat quality (Milan et al., 2000). A second missense 
mutation in pig PRKAG3, Val224Ile, has been reported to have an 
opposite effect on meat quality compared with the Arg200Gln 
mutation (Ciobanu et al., 2001). Several mutations in the 
human PRKAG2 gene have been found to cause 
cardiomyopathies (Gollob et al., 2001; Hamilton et al., 2001). Interestingly, a 
PRKAG2 missense mutation (Arg302Gln), occurring at the position 
corresponding to the Arg200Gln mutation in pig PRKAG3, 
causes the Wolff-Parkinson-White cardiomyopathy in 
humans. 

[Figure: exons 2-13 of the equine PRKAG3 gene, with the ATG start codon marked; exon sizes (bp): 40, 156, 404, 82, 59, 46, 55, 127, 166, 38, 147, 42; amplicon F1 (2.5 kb) labeled.] 

Fig. 1. Schematic diagram of the equine PRKAG3 gene. The exon-intron structure is shown together with the PCR amplicons 
used for sequencing (F0-F5). Open boxes indicate exons, and lines connecting the boxes indicate introns. Numbers below the open 
boxes represent the size of exons. Large introns are depicted with a broken line. Black boxes within the depicted PCR amplicons 
(F0-F5) indicate the corresponding exons. Primer locations are indicated by arrows. Arrows in bold represent sequencing primers. 
*1 and *2 indicate that the F1 forward and F5 reverse degenerated primers were used to amplify the PRKAG3 cDNA from horse 
muscle. 

Table 1. Primers used for PCR amplification and sequencing of the 
equine PRKAG3 gene 

Fragment a   Primer sequences b (5'→3')               Product size (kb) 
F0           F, CACCATGGAGCCCGAGCTGGAGCA              1.1 
             R, CCTGCTGCCCCTGCTCCCATCTC 
F1           F, AGCATCAAGAGATGAGCTTCCTAGAGCAAG        2.5 
             R, CCCACGAAGCTCTGCTTCTT 
F2           F, CTTCTTTGCCCTGGTGGCCA                  1.0 
             R, GAGACCACAGGCTTGAAGCA 
F3           F, AGAGGAAGCAGGGGAAGGGTG                 0.7 
             R, TGACCACAGGCAGCGCAGAG 
F4           F, CTTCCTTTCCCGCACCATCC                  1.2 
             R, AAGCGAGAGTAGAGGCCCACGA 
F5           F, GGTGGTGGTGGAGGTGAAAGAG                1.4 
             R, CCAGCAGGGCTGAGCACCAGTGCCTGAAGG 

a Indicates PCR fragments depicted in Fig. 1. 
b Primer sequences used for both PCR and sequencing are in bold. 

This investigation of the horse (Equus caballus) was 
initiated for two reasons. Firstly, we wanted to investigate the 
possibility that equine polysaccharide storage myopathy (PSSM) is 
caused by a PRKAG3 mutation. PSSM is an inherited 
myopathy in Quarter horses that resembles the porcine RN phenotype 
in that horses have a dramatically high muscle glycogen 
concentration (Valberg et al., 1992). All glycogenolytic and 
glycolytic enzyme activities in PSSM muscle are similar to healthy 
horses (Valberg et al., 1998). The increased muscle glycogen in 
PSSM horses is associated with enhanced sensitivity of skeletal 
muscle to insulin (De La Corte et al., 1999). The mode of 
inheritance for PSSM has not been firmly established but pedigree 
studies and limited breeding trials indicate a founder stallion 
and transmission of PSSM to offspring consistent with a 
recessive inheritance (Valberg et al., 1996; De La Corte et al., 2002). 
A second goal of this study was to determine if the strong 
selection in draught horses and racing horses may have influenced 
the population frequency of PRKAG3 alleles in some breeds. 
The large increase in skeletal muscle glycogen content in the 
RN pig suggests that PRKAG3 mutations may affect muscle 
strength and endurance as it is very well established that 
glycogen content is an important factor for resistance to muscle 
fatigue. Here we report the sequencing and characterization of 
the equine PRKAG3 gene and a screening for functionally 
important PRKAG3 mutations among horse breeds with 
strikingly different muscle phenotypes. 

Materials and methods 

Animals and DNA isolation 

The study included three Quarter horses affected by polysaccharide 
storage myopathy (PSSM) and two clinically healthy controls from the same 
breed. These samples were obtained from horses diagnosed at the University 
of Minnesota with PSSM on the basis of abnormal polysaccharide in skeletal 
muscle biopsies stained with periodic acid Schiff (PAS). Samples from eight 
unrelated Standardbred stallions with outstanding racing performance and 
five unrelated Standardbred horses with average racing performance were 
analysed. Furthermore, genomic DNA samples from horses previously 
subjected to paternity testing at the Blood Typing Laboratory in Uppsala, 
Sweden, were used for a breed comparison. The sample included heavily muscled 
draught horses (Belgian), moderately muscled breeds (North Swedish 
Trotter, Fjord, and Swedish Warmblood), breeds strongly selected for racing 
performance (Thoroughbred and Standardbred) and ponies (Icelandic horse and 
Shetland pony). 

Table 2. PRKAG3 exon/intron organization in horse, human, pig, mouse, and zebrafish 

         Exon length a (bp)                                  Intron length a (bp) 
Exon     Horse   Pig    Human   Mouse   Zebrafish   Intron   Horse    Pig     Human   Mouse   Zebrafish 
1        n.a.    108    33      33      n.a.        1        n.a.     302     362     306     n.a. 
2        40 b    40     40      40      n.a.        2        >700     478     434     447     n.a. 
3        156     156    156     153     n.a.        3        318      360     361     313     n.a. 
4        404     404    404     407     108         4        >1500    890     1377    1200    216 
5        82      82     82      82      82          5        >400     460     456     114     3225 
6        59      59     59      59      59          6        122      101     125     100     2577 
7        46      46     46      46      46          7        183      216     203     179     87 
8        55      55     55      55      55          8        177      201     201     468     307 
9        127     127    127     127     127         9        140      132     154     99      4342 
10       166     166    166     166     166         10       >1500    1127    2349    ~1900   97 
11       38      38     38      38      38          11       >100     175     170     168     1247 
12       147     147    147     147     147         12       355      356     341     330     96 
13       n.a.    117    117     117     108 

a n.a.: Not available. 
b Estimated based on the size of the corresponding exon in other mammals. 

RT-PCR cloning and genomic sequencing 

Needle muscle biopsies were obtained from the gluteus medius muscles 
of three Quarter Horses affected with PSSM and two healthy controls. The 
tissue was frozen in liquid nitrogen immediately after biopsy. mRNA was 
isolated from the muscle biopsies using the Invitrogen Micro-FastTrack 2.0 
kit. cDNA was prepared using the Invitrogen Superscript II RT kit with
random hexamers, followed by PCR with degenerate primers designed on the
basis of human, mouse, and pig sequences to amplify an equine PRKAG3
gene fragment covering 11 exons (Fig. 1 and Table 1). Amplification was
conducted in 20-µl reactions each containing 30 ng cDNA, 0.2 mM dNTPs,
1.5 mM MgCl2, 5 pmol of each primer, AmpliTaq Gold DNA polymerase,
and reaction buffer (PE Applied Biosystems, Foster City, USA). The cycling
conditions included an initial incubation at 94 °C for 5 min followed by 32
cycles comprising 1 min at 94 °C, 1 min at 55 °C, and 1 min at 72 °C. PCR
products were purified using the QIAquick PCR purification kit (Qiagen,
Hilden, Germany) and directly sequenced using BigDye Terminator
chemistry (PE Applied Biosystems). Inter-exon PCR amplifications were performed
using genomic DNA to determine the PRKAG3 genomic organization as
indicated in Fig. 1. Annealing temperatures were in the range 50-65 °C. The
PCR products were sequenced directly, and the exon-intron boundaries were 
established by comparing genomic and cDNA sequences. The numbering of 
codons follows the one reported for pig PRKAG3 (Milan et al., 2000). The 
sequence data reported in this paper have been deposited in GenBank under 
accession numbers AY376689 and AY42371-AY423273. 
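
The reaction set-up above can be restated in more familiar units. The following is a minimal sketch, assuming only the figures given in the text (5 pmol of each primer in a 20-µl reaction; 5 min initial incubation; 32 cycles of three 1-min steps). The function names are illustrative, and the time estimate ignores ramping between temperatures, which the text does not report.

```python
def primer_concentration_uM(amount_pmol: float, volume_ul: float) -> float:
    """Convert pmol in a given volume (µl) to µM; 1 pmol/µl equals 1 µM."""
    return amount_pmol / volume_ul


def programmed_minutes(initial_min: float, cycles: int, step_minutes: tuple) -> float:
    """Sum the programmed hold times only; ramping between steps is ignored."""
    return initial_min + cycles * sum(step_minutes)


primer_uM = primer_concentration_uM(5, 20)        # 5 pmol in a 20-µl reaction
total_min = programmed_minutes(5, 32, (1, 1, 1))  # 94, 55 and 72 °C steps
print(primer_uM, total_min)
```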

Bioinformatic characterization 

DNA and predicted protein sequences were analyzed using the
Sequencher program (version 3.0, Gene Codes Corp). The BLAST family of
programs was used for database searches on the NCBI servers at
http://www.ncbi.nlm.nih.gov/blast. Cross-species comparison between pig and
horse genomic sequences was done with the Alfresco program (Jareborg and
Durbin, 2000). The results were confirmed by a comparative genomic sequence
analysis of horse, pig, mouse, human, and zebrafish sequences. For the
phylogenetic analysis, AMPK γ nucleotide sequences corresponding to exons 4-13
in the γ3 gene were retrieved from GenBank or the Ensembl databases.
However, the zebrafish AMPK γ2 sequence was retrieved from the TIGR
zebrafish Gene Index database (http://www.tigr.org/tdb/tgi/zgi/). The Fugu
AMPK γ3 sequence was retrieved from the IMCB database
(http://scrappy.fugu-sg.org/Fugu_rubripes/) and only included eight exons
(exons 6-13). The Fugu cDNA sequence lacked 38 bp encoded by exon 11 in
mammalian PRKAG3 genes and apparently included a slightly longer sequence
corresponding to exon 12. However, an analysis of the genomic sequence showed
that exon 11 was present and well-conserved in the Fugu genome. We
therefore constructed a Fugu PRKAG3 transcript containing the exon 11
sequence to facilitate the phylogenetic comparison with the mammalian
homologues. Multiple sequence alignment was done with ClustalW
(Thompson et al., 1994). A Neighbor-Joining phylogenetic tree, based on genetic
distances calculated with Kimura’s two-parameter method, was constructed
using MEGA 2.1 (Kumar et al., 2001).
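
For readers unfamiliar with Kimura’s two-parameter distance, the calculation can be sketched as follows. This is an illustrative implementation of the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions; it is not the MEGA 2.1 code used in the study, and the function name and the rule for skipping gapped or ambiguous sites are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (non-positive argument).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is the input that a Neighbor-Joining algorithm clusters into a tree.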

SNP screening 

The single nucleotide polymorphism (SNP) at codon 258 (Pro258Leu)
was genotyped by pyrosequencing, after PCR amplification of a 182-bp
genomic fragment containing exon 8 using a biotinylated forward primer
(5'-GAGGTGGGACAGTCTGGGGGCT) and a reverse primer
(5'-ACTGAAGGGCTGGGGAAGGGACT). The pyrosequencing reaction was carried
out with an internal sequencing primer (5'-GGAGAGATGGAGACCAGA)
according to the manufacturer’s recommendation (Pyrosequencing AB,
Uppsala, Sweden).

Results 

Characterization of the horse PRKAG3 gene and phylogenetic analysis

One specific 1,281-bp RT-PCR product with an open read¬ 
ing frame encoding 427 amino acids was obtained using degen¬ 
erate primers. BLAST searches revealed that horse PRKAG3 
has an 87:83%, 88:83%, and 90:88% nucleotide:amino acid 
identity with the corresponding human, mouse, and pig 
PRKAG3 sequences, respectively. Four genomic fragments 
(F1-F4) containing 11 exons were amplified. A forward primer 
designed from the mouse PRKAG3 cDNA sequence was used 
to obtain an additional genomic fragment (F0) containing the 
putative translation start codon. The exon/intron organization 
was determined by comparing genomic and cDNA sequences 
(Fig. 1; Table 2). All splice sites corresponded to the 5' donor 
(GT) and 3' acceptor (AG) consensus sequences (Breathnach 
and Chambon, 1981). 
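
The splice-site check described above amounts to confirming that each intron begins with GT and ends with AG. A minimal sketch follows; the function name is illustrative and the example sequences are hypothetical, not taken from the PRKAG3 data.

```python
def has_consensus_splice_sites(intron: str) -> bool:
    """True if the intron starts with the GT donor and ends with the AG acceptor."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical intron sequences, used only to exercise the check.
examples = {
    "consensus": "GTAAGTCCTTTCTCTCAG",
    "non_consensus": "GCAAGTCCTTTCTCTCAG",
}
for name, seq in examples.items():
    print(name, has_consensus_splice_sites(seq))
```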

PRKAG3 homologues have previously been characterized
in several mammalian species. A bioinformatic search revealed
clear evidence for the presence of PRKAG3 homologues also in
two fish species, the zebrafish (ENSDART00000000381) and
the pufferfish (Fugu; SINFRUT00000067047). The overall
genomic organization was well conserved among mammalian
and fish species (Table 2). A multiple alignment including our
partial horse PRKAG3 sequence and AMPK γ nucleotide
sequences from other species was used to construct a
Neighbor-Joining phylogenetic tree (Fig. 2). The AMPK γ1, γ2, and γ3
isoforms formed distinct clusters and the data provided
conclusive evidence that the horse PRKAG3 homologue has been
identified in this study.

Fig. 2. Phylogenetic tree constructed for AMPK γ nucleotide sequences
corresponding to exon 6-13 in the mammalian PRKAG3 gene. Numbers at
the nodes represent the bootstrap support values derived from 1,000
replicates. The scale indicates the genetic distance. The accession numbers for
the sequences used are as follows: human (Hu) PRKAG1, NM_002733; mouse
(Mu) PRKAG1, NM_016781; human PRKAG2, AF087875; mouse
PRKAG2, NM_145401; human PRKAG3, AF214519; mouse PRKAG3,
NM_153744; pig PRKAG3, AF214520; horse PRKAG3, AY376689;
zebrafish (Da) PRKAG2, TC143275; zebrafish PRKAG3,
ENSDART00000000381; Fugu (Fu) PRKAG1, SINFRUT00000162627;
Fugu PRKAG2, SINFRUT00000165144; Fugu PRKAG3,
SINFRUT00000067047; Drosophila (Dro) PRKAG, AF094764. All sequences are
from GenBank except the fish sequences that were obtained from the
ENSEMBL and TIGR Gene Index databases.


No obvious association between PRKAG3 polymorphism 
and polysaccharide storage myopathy (PSSM) 

The complete PRKAG3 coding sequence, except the first 13 
and the last 24 codons, was determined by RT-PCR analysis of 
muscle samples from three Quarter horses affected by PSSM 
and two non-affected controls from the same breed (Table 3). 
We detected four SNPs, two of which were non-synonymous 
substitutions, but none showed a complete association with 
PSSM. Since we have screened almost the entire coding 
sequence (428 out of 465 codons) and since the affected horses 
did not share any specific PRKAG3 haplotype we can conclude 
that it is highly unlikely that PSSM is caused by a single muta¬ 
tion affecting the PRKAG3 coding sequence. However, we can¬ 
not exclude the possibility that this disease is caused by a 
PRKAG3 mutation occurring in an untranslated region or in a 
regulatory element. 

Genetic variation at the PRKAG3 locus across divergent 
horse breeds 

To test the possibility that the strong directional selection 
for muscle strength or racing performance in certain breeds has 
influenced the PRKAG3 allele frequency distribution across 
breeds we decided to determine almost the entire coding 
sequence from a sample of horses representing different breeds. 
Initially we sequenced eight Standardbred stallions with
outstanding racing performance and five controls with average
racing performance. Six SNPs were detected, four non-synonymous
and two synonymous substitutions (Table 4). There was
an indication of an allele frequency difference between
stallions and controls for the SNPs at codons 26 and 51, but the
difference was not significant in this limited sample. The
Asn362His mutation that was found in a single breeding
stallion is also potentially interesting, since Asn362 is highly
conserved among mammalian PRKAG3 sequences.
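
The text does not state which test was used to compare stallions and controls. As an illustration only, a two-sided Fisher's exact test on allele counts is one way such a small sample could be assessed. The counts below are an assumption: they are reconstructed from the rounded frequencies in Table 4 for the codon 26 SNP (0.31 of 16 alleles is taken as 5; 0.10 of 10 alleles as 1).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x variant alleles in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Assumed allele counts (variant, other) for the codon 26 SNP.
stallions = (5, 11)   # 8 stallions, 16 alleles
controls = (1, 9)     # 5 controls, 10 alleles
p_value = fisher_exact_two_sided(*stallions, *controls)
print(round(p_value, 3))
```

Under these assumed counts the p-value is well above 0.05, which agrees with the statement that the difference was not significant.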


Table 3. Comparison of PRKAG3 cDNA sequences of horses affected
with polysaccharide storage myopathy (PSSM) and non-affected controls. A
dash indicates identity to the master sequence a

Horse                  Codon (Exon)
                       26(3)    49(3)    271(9)   291(9)

Master sequence        GGA      CCG      GCC      GGC
                       G        P        A        G
Affected horses
1                      R--      -Y-      ---      ---
                       R/G      P/L      -        -
2                      ---      -Y-      ---      ---
                       -        P/L      -        -
3                      ---      ---      --A      --T
                       -        -        -        -
Non-affected horses
4                      ---      ---      ---      ---
                       -        -        -        -
5                      R--      ---      ---      ---
                       R/G      -        -        -

a Nucleotide R=A/G, Y=C/T.
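
The notation in Table 3 can be decoded mechanically: a dash keeps the master base, and the ambiguity codes R and Y expand to two alternative bases, each of which is then translated. The sketch below is illustrative; the helper names are hypothetical and the codon table is deliberately limited to the codons needed for the four positions in the table.

```python
from itertools import product

# Ambiguity codes used in Table 3 (R = A/G, Y = C/T) plus the plain bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}

# Subset of the standard genetic code covering the codons in Table 3.
CODON_TABLE = {
    "GGA": "G", "AGA": "R",
    "CCG": "P", "CTG": "L",
    "GCC": "A", "GCA": "A",
    "GGC": "G", "GGT": "G",
}


def apply_variant(master: str, variant: str) -> str:
    """Replace '-' in the variant pattern with the master base."""
    return "".join(m if v == "-" else v for m, v in zip(master, variant))


def amino_acids(codon: str) -> set:
    """Translate every unambiguous codon that an ambiguous codon stands for."""
    expanded = ("".join(bases) for bases in product(*(IUPAC[b] for b in codon)))
    return {CODON_TABLE[c] for c in expanded}


for master, variant in [("GGA", "R--"), ("CCG", "-Y-"), ("GCC", "--A"), ("GGC", "--T")]:
    print(master, variant, sorted(amino_acids(apply_variant(master, variant))))
```

The first two patterns yield two amino acids each (heterozygous non-synonymous sites), while the last two leave the amino acid unchanged (synonymous sites).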


Table 4. PRKAG3 nucleotide substitutions identified among 22 horses
representing six breeds a and observed allele frequencies among breeding
stallions (n = 8) and controls (n = 5) of Standardbred horses

Exon/nucleotide   Nucleic acid   Amino acid      Allele frequency c
position b        change         substitution    Breeding stallions   Controls
3/76              G<->A          G26R            0.31                 0.10
3/146             C<->T          P49L            0                    0.10
3/151             G<->A          E51K            0.25                 0.10
8/773             C<->T          P258L           0                    0
9/813             C<->A          -               0.44                 0.60
9/873             T<->C          -               0.12                 0.10
10/1084           A<->C          N362H           0.06                 0

a Standardbred, Thoroughbred, North Swedish Trotter, Belgian, Shetland pony,
and Quarter horse.
b Nucleotide numbers counted from the translation start codon located in exon 3.
c The allele frequencies refer to the allele indicated as the variant allele.

We then sequenced the PRKAG3 gene from one horse from
each of the following breeds: Belgian, North Swedish Trotter,
Shetland pony, and Thoroughbred. The seven SNPs detected
across breeds are given in Table 4. All SNPs were confirmed by
sequencing both strands across the SNP and all SNPs except
the one at exon 10, nt 1084 were found in more than one
individual. The non-synonymous substitution at codon 258
(Pro258Leu) found in a Belgian and a North Swedish Trotter
was particularly interesting. This substitution occurs at residue
5 in the second cystathionine β-synthase domain (CBS2) of the
AMPK γ3 chain. Proline at this residue is conserved among all
mammalian AMPK γ isoforms and also in a Drosophila
homologue. We therefore decided to further investigate the allele
frequency distribution of this SNP by genotyping 111 horses
representing nine different breeds (Table 5). The SNP
screening revealed a putative association between the presence of the
Leu258 allele and muscle development since it was only found
in the heavy (Belgian) and moderately heavy (North-Swedish
Trotter, Swedish Warmblood, and Fjord) horses but not among
29 horses representing three light breeds strongly selected for
speed and racing performance (Standardbred, Thoroughbred,
and Quarter horse) nor among the pony breeds included in this
study (Icelandic horse and Shetland pony).
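
Table 4 gives SNP positions as nucleotide numbers counted from the translation start codon, whereas the text refers to codons. The conversion between the two can be sketched as below, assuming that the numbering covers coding sequence only; the function name is illustrative.

```python
def codon_of(nt_position: int) -> tuple:
    """Return (codon number, position within codon) for a 1-based coding position."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    codon = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon, position_in_codon


# Nucleotide positions listed in Table 4.
for nt in (76, 146, 151, 773, 813, 873, 1084):
    print(nt, codon_of(nt))
```

For example, nucleotide 773 falls at the second position of codon 258, consistent with a C/T change converting a proline codon to a leucine codon, and nucleotides 813 and 873 fall at third positions, consistent with the synonymous changes at codons 271 and 291.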

Discussion 

In this study, we have reported the cDNA and corresponding
genomic sequence for the equine PRKAG3 gene. The RT-PCR
analysis documented the expression of PRKAG3 in horse
skeletal muscle as expected from previous studies in pig and
human (Milan et al., 2000). A phylogenetic analysis including
horse, pig, mouse, human, pufferfish, and zebrafish sequences
showed conclusively that the horse AMPK γ sequence reported
in this study is a γ3 homologue. The result suggested that the
genes for the three different mammalian isoforms of the AMPK
γ chains evolved by gene duplications from a common
ancestral gene subsequent to the divergence from an invertebrate
ancestor but before the divergence of bony fishes and mammals
(Fig. 2). The PRKAG3 homologues in zebrafish and in
pufferfish identified in this study have not yet been correctly
annotated but our interpretation of homology is strongly supported
by the presence of conserved synteny involving for instance the
KIAA0173 gene that is located in the near vicinity of PRKAG3
in both mammals and the two fish species.

We identified seven equine PRKAG3 SNPs, five of which 
cause amino acid substitutions. The limited number of horses 
screened in this study did not indicate a complete association 
between any SNP and equine polysaccharide storage myopathy 
that occurs in Quarter horses. Further studies are required to 
reveal the genetic basis for this disease. A major step forward 
would be to establish a map localization to exclude the majority 
of the large number of potential candidate genes, in addition to 
PRKAG3, that may influence glycogen content in skeletal mus¬ 
cle. 

The comparative sequence analysis across nine different 
horse breeds revealed several potentially interesting mutations. 
An SNP screening comprising 111 horses revealed the presence 
of the mutant allele (Leu258) in breeds that can be classified as 
heavily muscled (Belgian) or moderately heavy (North Swedish 
Trotter, Fjord, and Swedish Warmblood) but it was not found 
in breeds selected for speed or racing performance (Standardbred,
Thoroughbred, or Quarter horse) or in ponies (Icelandic horse
and Shetland pony). The fact that Pro258 is evolutionarily
very well conserved implies that this may be a functional
SNP. This study will facilitate future studies of possible
associations between PRKAG3 polymorphism and muscle de¬ 
velopment/function in the horse. It will be of particular interest 
to measure glycogen contents in horses with different genotypes 
since all functionally important AMPK mutations detected so 
far are associated with altered glycogen content in skeletal mus¬ 
cle (PRKAG3 mutations) or in heart (PRKAG2 mutations).

Table 5. Frequency of the PRKAG3 Leu258 allele among nine horse breeds

Breed                    n     Leu258
Belgian                  21    0.14
North Swedish Trotter    20    0.27
Fjord                    10    0.20
Swedish Warmblood        10    0.10
Icelandic horse          10    0
Shetland pony            11    0
Thoroughbred             11    0
Standardbred             13    0
Quarter horse            5     0

Abstract. The genus Equus is unusual in that five of the ten
extant species have documented centric fission (Robertsonian
translocation) polymorphisms within their populations, namely
E. hemionus onager, E. hemionus kulan, E. kiang,
E. africanus somaliensis, and E. quagga burchelli. Here we report
evidence that the polymorphism involves the same homologous
chromosome segments in each species, and that these
chromosome segments have homology to human chromosome 4
(HSA4). Bacterial artificial chromosome clones containing
equine genes SMARCA5 (ECA2q21 homologue to HSA4q31.21)
and UCHL1 (ECA3q22 homologue to HSA4p13) were
mapped to a single metacentric chromosome and two unpaired
acrocentrics by FISH mapping for individuals possessing odd
numbers of chromosomes. These data suggest that the
polymorphism is either ancient and conserved within the genus or
has occurred recently and independently within each species.
Since these species are separated by 1-3 million years of
evolution, this polymorphism is remarkable and worthy of further
investigations.

Copyright©2003 S. Karger AG, Basel 


The phylogenetic order Perissodactyla, or odd-toed ungulates,
was very diverse and species-rich during the late Paleocene
into the Eocene, but extinctions have reduced the order to
three families, Tapiridae, Rhinocerotidae, and Equidae
(Nowak, 1999). Equidae, once a worldwide and diverse family, is
now composed of a single genus, Equus, with ten extant species
(reviewed in Bowling and Ruvinsky, 2000). Equus first
appeared in the fossil record 3.7 million years (MY) ago, and
diverged in as little as 1.7 MY to form four related groups: the
horses, the true asses, the hemiones, and the zebras (reviewed
in Bowling and Ruvinsky, 2000; Oakenfull et al., 2000). These
extant equid species are listed in Table 1, along with their
common names and modal and polymorphic diploid chromosome
numbers. Recent studies of mitochondrial DNA sequences
among equids allowed construction of a maximum likelihood
tree suggesting clades for horses, true asses, hemiones, and
zebras (Oakenfull et al., 2000). Despite the relatively recent
divergence of equids, they have widely varying diploid
chromosome numbers, ranging from 2n = 66 in Przewalski’s wild horse
(E. przewalskii, EPR) to 2n = 32 in Hartmann’s mountain zebra
(E. zebra hartmannae, EZH). The Equidae are thought to have
undergone rapid chromosome evolution concurrent with
speciation (Bush et al., 1977). Each of the equid species has a
unique, modal number of chromosomes. However,
polymorphisms for chromosome number have been found among
normal, healthy members of the hemiones: the onager
(E. hemionus onager, EHO) (Ryder, 1978), the kulan (E. hemionus kulan,
EHK) (Ryder, 1978), and the kiang (E. kiang, EKI) (Ryder and
Chemnick, 1990); as well as in the Somali wild ass (E. africanus
somaliensis, EAF) (Houck et al., 1998) and Burchell’s zebra
(E. quagga burchelli, EQB) (Whitehouse et al., 1984) (Table 1).
The karyotypes of individuals heterozygous for a centric
fission have an unpaired large metacentric chromosome and two
unpaired acrocentric chromosomes. This study was initiated to
determine if the chromosome number polymorphisms seen in
E. hemionus onager, E. hemionus kulan, E. kiang, E. africanus
somaliensis, and E. quagga burchelli involved homologous or
nonhomologous chromosomes. Other equids have been gene
mapped using fluorescently labeled DNA probes based on
domestic horse sequences (Raudsepp et al., 2002). Therefore,
horse bacterial artificial chromosome (BAC) clones, each
containing a horse gene previously mapped to a specific horse
chromosome, were mapped by FISH to the chromosomes of
polymorphic individuals with an odd number of chromosomes, and
to some non-polymorphic individuals.

Table 1. Species nomenclature, common names, and chromosome numbers for
Equus species (Nowak, 1999; Bowling and Ruvinsky, 2000)

Species                        Species abbreviation (common names)                    Modal 2N   Observed polymorphic 2N
Horses
  Equus przewalskii            EPR (Przewalski’s wild horse, Mongolian wild horse)    66         -
  Equus caballus               ECA (domestic horse)                                   64         -
True asses
  Equus asinus                 EAS (donkey, ass, E. africanus [domestic])             62         -
  Equus africanus somaliensis  EAF (Somali wild ass)                                  62         63, 64
Hemiones
  Equus hemionus onager        EHO (onager, Persian wild ass)                         56         55
  Equus hemionus kulan         EHK (kulan, Transcaspian wild ass)                     54         55
  Equus kiang                  EKI (kiang, Tibetan wild ass)                          52         51
Zebras
  Equus grevyi                 EGR (Grevy’s zebra)                                    46         -
  Equus quagga burchelli       EQB (Burchell’s zebra, plains zebra)                   44         45
  Equus zebra hartmannae       EZH (Hartmann’s mountain zebra)                        32         -

In connection with another study, and reported here, two
horse BAC clones were mapped to the polymorphic
chromosomes of an onager with 2n = 55 (Myka, unpublished data).
One of these horse BAC clones contained the SWI/SNF-related,
matrix-associated actin dependent regulator of chromatin,
subfamily A, member 5 gene, SMARCA5 (ECA2q21 homologue
of HSA4q31.21; Lear et al., 2001; Karolchik et al., 2003)
and the other contained the ubiquitin carboxyl-terminal
esterase L1 gene, UCHL1 (ECA3q22 homologue of HSA4p13; Lear
et al., 2001; Karolchik et al., 2003). These same probes were
used to investigate the chromosome polymorphisms in other
equids and to determine whether or not they were homologous
to those found in the onager.


Materials and methods 

Chromosome preparations 

Metaphase chromosome spreads were prepared by the CRES
cytogenetics laboratory at the Zoological Society of San Diego. E. hemionus onager,
E. hemionus kulan, E. kiang, E. africanus somaliensis, E. quagga burchelli,
E. zebra hartmannae, E. grevyi, and E. przewalskii metaphase spreads were
prepared from fibroblast cell cultures using standard methods as previously
described (Kumamoto et al., 1996). Cells were harvested after exposure to
colcemid (final concentration 0.025 µg/ml) for 0-105 min, and subsequently
exposed to 0.067 M KCl for 30 min prior to fixation in methanol:acetic
acid.

Probes 

DNA was prepared from two equine BAC clones, obtained from Institut
National de la Recherche Agronomique (INRA), Jouy-en-Josas, France. This
BAC library has about 40,000 clones and a mean insert size of 110 kb with a
1.5 genome equivalent (Godard et al., 1998), and the complemented INRA
BAC library has a total of 108,288 clones, a mean insert size of 100 kb, and
3.4 genome equivalent (Milenkovic et al., 2002). The SMARCA5 BAC
(INRA281E7) was previously mapped to ECA2q21 and the UCHL1 BAC
(INRA208G12) was previously mapped to ECA3q22 (Lear et al., 2001). In
the human genome, SMARCA5 maps to HSA4q31.21 and UCHL1 maps to
HSA4p13 (Karolchik et al., 2003).
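
The genome-equivalent figures quoted for the two libraries follow from the relation coverage = clones x mean insert size / genome size. As a rough check, the relation can be inverted to show the genome size each pair of figures implies. This is a sketch only: the function name is illustrative, and the quoted clone numbers and insert sizes are themselves approximate.

```python
def implied_genome_size_bp(n_clones: int, mean_insert_bp: float,
                           genome_equivalents: float) -> float:
    """Genome size implied by a library's clone count, insert size and coverage."""
    return n_clones * mean_insert_bp / genome_equivalents


original_library = implied_genome_size_bp(40_000, 110_000, 1.5)
complemented_library = implied_genome_size_bp(108_288, 100_000, 3.4)
print(f"{original_library / 1e9:.2f} Gb, {complemented_library / 1e9:.2f} Gb")
```

Both libraries imply a genome of roughly 3 Gb, so the two sets of quoted figures are mutually consistent.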

FISH mapping and analysis 

DNA labeling and FISH was performed as previously described (Lear et
al., 2001). Briefly, 1 µg of BAC DNA for each probe was nick translated and
labeled either with biotin-14-dATP (Life Technologies) or with
digoxigenin-11-dUTP (Roche) according to manufacturers’ directions. Chromosomes
were counterstained with either 50 ng/ml DAPI/Antifade (Ventana) or 31.5
ng/ml DAPI III/Antifade (Vysis, Inc.), and slides were stored at -20 °C.
Hybridization results were examined and analyzed using an Axioplan 2
fluorescent microscope (Zeiss) equipped with Cytovision®/Genus™ Application
Software Version 2.7 (Applied Imaging).

Results and discussion 

BACs containing the genes SMARCA5 and UCHL1 hybrid¬ 
ized to three different chromosomes in equids known to exhibit 
the polymorphism: E. hemionus onager with 2n = 55, E. kiang 
with 2n = 51, E. hemionus kulan with 2n = 55, E. africanus 
somaliensis with 2n = 63, and E. quagga burchelli with 2n = 45 
(Fig. 1 A, B, C, D, E, respectively). SMARCA5 hybridized to the 
p arm of a single metacentric and its acrocentric homologue 
while UCHL1 hybridized to the q arm of the same metacentric 
and its acrocentric homologue. The position of each probe on 
the metacentric chromosome appeared to correspond to a simi¬ 
lar position on the acrocentric chromosome. In addition, 
SMARCA5 and UCHL1 were FISH mapped in E. przewalskii 
(EPR), E. grevyi (EGR) and E. zebra hartmannae (EZH) (data 
not shown). In EPR, SMARCA5 and UCHL1 hybridized to the 
q arms of two metacentric chromosome pairs, while in both 
EGR and EZH, SMARCA5 and UCHL1 hybridized to oppo¬ 
site arms of one metacentric pair (Myka et al., this volume). 
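
The odd chromosome numbers in these individuals follow from simple bookkeeping: each copy of the metacentric that is present as two acrocentrics adds one to the diploid number, while the number of chromosome arms is unchanged. A minimal sketch of that arithmetic is given below. The function name is illustrative, and taking 2n = 62 as the unfissioned state for the Somali wild ass is an assumption based on the modal number in Table 1.

```python
def diploid_number(unfissioned_2n: int, fissioned_copies: int) -> int:
    """Diploid number when 0, 1 or 2 copies of one metacentric are split in two."""
    if fissioned_copies not in (0, 1, 2):
        raise ValueError("a diploid carries 0, 1 or 2 fissioned copies")
    return unfissioned_2n + fissioned_copies


# Somali wild ass, assuming 2n = 62 carries the metacentric pair intact.
for copies, state in enumerate(("metacentric pair", "heterozygote", "acrocentric pairs")):
    print(state, diploid_number(62, copies))
```

The heterozygote is the karyotype with one metacentric and two unpaired acrocentrics, which is the class of individual examined by FISH here.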

The results described above (Fig. 1) demonstrate that
homologous chromosomes were involved in the chromosome
polymorphism of five extant equid species. The homology of a
few gene loci certainly does not prove homology of entire
chromosome arms. However, the combination of chromosome
painting, banding patterns, and other mapping information
suggests that these chromosome arms are homologous.
Additionally, the results for EPR (the two probes mapping to
different arms of two metacentric chromosome pairs) were
homologous to that found in the horse (Myka et al., this volume). The
results for EGR and EZH (two probes mapped to opposite arms
of a single pair of metacentric chromosomes) were homologous
to that found in E. asinus (EAS) by both cross-species painting
and FISH mapping (Raudsepp et al., 1999; Raudsepp et al.,
2001) and for human chromosome paints to EZH (Richard et
al., 2001).

Fig. 1. Homologous chromosome polymorphism
in (A) E. hemionus onager, 2n = 55, (B) E.
kiang, 2n = 51, (C) E. hemionus kulan, 2n = 55,
(D) E. africanus somaliensis, 2n = 63, and (E) E.
quagga burchelli, 2n = 45. The SMARCA5 probe
is visualized with FITC (green) and the UCHL1
probe is visualized with rhodamine red-X (red).

The chromosome configuration of the HSA4 homologue is
known for all extant equid species, with the addition of the
results in this report. Raudsepp et al. (1996) demonstrated that
HSA4 in domestic horses is split between ECA2q and ECA3q,
as shown for EPR (Myka et al., this volume). Meanwhile, HSA4
homologous DNA is present as a single metacentric
chromosome in EAS (Raudsepp and Chowdhary, 1999), EZH (Richard
et al., 2001) and EGR (this study). HSA4 homologues have
been conserved as single chromosomes or in large segments in
many species. For example, large portions of HSA4 have been
conserved on chromosome 4 of the domestic chicken (Gallus
gallus domesticus) (Chowdhary and Raudsepp, 2000), and on
the domestic pig (Sus scrofa) chromosome 8 (Larsen et al.,
1999). However, HSA4 homologous DNA is divided among
BTA6, BTA17, and BTA27 in domestic cattle (Fisher et al.,
1997; Sonstegard et al., 2000).

Balanced chromosome polymorphisms are relatively
uncommon but have been described in other species. A common
example of a balanced chromosome polymorphism is the 1;29
Robertsonian translocation found in domestic cattle (Bos
taurus, BTA). In some individuals, the smallest chromosome,
BTA29, has fused with the largest chromosome, BTA1
(Gustavsson and Rockborn, 1964). Daughters of bulls with
rob(1;29) experience lowered fertility, but the polymorphism
persists in some herds of domestic cattle (Weber et al., 1989).
Balanced chromosome polymorphisms have been identified in
other mammalian species as well, such as oryx (Oryx dammah
and O. leucoryx) (Kumamoto et al., 1999), gazelle (Gazella
subgutturosa marica, G. bennetti, and G. saudiya) (Vassart et
al., 1993; Kumamoto et al., 1995), the rock wallaby (Petrogale
lateralis pearsoni) (Eldridge and Pearson, 1997), domestic
sheep (Ovis aries) (Koop et al., 1983), and the owl monkey
(Aotus) (Ma et al., 1976), but in all cases are relatively uncommon.
One out of four species in the genus Kobus (K. ellipsiprymnus)
exhibited two polymorphic centric fusions (Kingswood et al.,
2000). Interestingly, a balanced chromosomal polymorphism
has been reported in another Perissodactyl, the northern white
rhinoceros (Ceratotherium simum) (Houck et al., 1994), and
research is underway to determine if the rhinoceros
polymorphism is homologous to the equids (Lear, personal
communication). However, cytogenetic studies among members of
phylogenetic orders and families have been limited, so it is not
possible to say whether this situation among the Equidae and other
Perissodactyls might be unique.

The discovery of the same chromosome polymorphism in 
five closely related equid species separated by as many as 3 MY 
of evolution is remarkable. The polymorphisms could be the 
result(s) of fission of a single metacentric chromosome resulting 
in two acrocentric chromosomes or of fusion of two acrocentric 
chromosomes forming a single metacentric chromosome. The 
two possible events are depicted in Figs. 2a and b. Parsimony 
favors fission since the metacentric configuration (HSA4 ho¬ 
mologue) may be typical of the ancestral karyotype (Chowdha¬ 
ry et al., 1998; Murphy et al., 2001; Yang et al., 2003). With 
respect to fission, two opposing hypotheses may account for the 
existence of these polymorphic chromosomes: 1) a single ances¬ 
tral fission or 2) multiple, independent fissions. The ancestral 
fission hypothesis, illustrated in Fig. 2a, suggests that the poly¬ 
morphism occurred once in an ancestral equid species. Essen¬ 
tially, one metacentric chromosome from a pair homologous to 
HSA4 in the ancestral equid could have undergone a fission 
event, resulting in the polymorphism seen as a single metacent¬ 
ric and two acrocentric chromosomes. Also, this hypothesis 
suggests that the polymorphism would have been maintained 
throughout speciation of EHO, EHK, EKI, EQB, and EAF, and 
that these extant species carry the legacy of the ancestral fission 
event. EAS, EGR, and EZH have metacentric pairs of chromo¬ 
somes homologous to the ancestral HSA4 homologue. Finally, 
before the speciation events leading to ECA and EPR, a fusion 
event could have occurred resulting in the current situation in 
the horses, namely that the HSA4 homologous arms are found 
in two separate chromosomes, ECA2 and ECA3. 

The independent fission hypothesis would have involved 
multiple, and possibly as many as 5, independent fission events 
in the extant equid species or their ancestors. Furthermore, 
independent fissions of this chromosome would suggest that 
some characteristic of the HSA4 homologue in the equids ren¬ 
ders it susceptible to fissioning. This hypothesis is supported by 
the occurrence of de novo fissions of the HSA4 homologue 
found in a donkey foal (EAS) (Bowling and Millon, 1988) and 
in a Somali wild ass (Houck et al., 1998). 

Determining which of these historical events occurred may 
be difficult. Studies of DNA sequences in mitochondria are 
useful to suggest a sequence of events and times of divergence 


Cytogenet Genome Res 102:217-221 (2003) 


219 









Equid Ancestral Chromosome 
(HSA4 homologue) 



Fission event 



Polymorphism 

maintained 





EHO, EHK, EKI, 
EQB. EAF 


Model for ancestral fission event hypothesis 


Fig. 2. Fission/fusion hypothesis models.
(a) Model for ancestral fission event hypothesis. In this model, the
polymorphism arose by a fission event in the HSA4 homologue in the
ancestral equid species. Subsequently, the polymorphism was fixed and
maintained in EHO, EHK, EKI, EQB, and EAF. However, the fission event
was followed by a fusion event with different chromosome segments,
leading to the configuration in ECA and EPR with the HSA4 homologous
DNA found in two different arms in two metacentric chromosome pairs.
The metacentric condition seen in EAS, EGR, and EZH may represent the
ancestral condition prior to the fission event, or a fixation of the
metacentric chromosome following the fission event.
(b) Model for ancestral fusion event hypothesis. In this model, the
polymorphism arose by a fusion event which fused two acrocentric
chromosomes with homology to HSA4 in the ancestral equid species.
Subsequently, the polymorphism was fixed and maintained in EHO, EHK,
EKI, EQB, and EAF. However, the fusion event was followed by a second
fusion event with different chromosome segments, leading to the
configuration in ECA and EPR with the HSA4 homologous DNA found in two
different arms in two metacentric chromosome pairs. The metacentric
condition seen in EAS, EGR, and EZH may represent the fixation of the
metacentric chromosome following the fusion event.
for the different equid species (Oakenfull et al., 2000). However,
chromosomal genes can participate in genetic recombination, which
destroys haplotype associations. Bailey et al. (2002) reported that
chromosome rearrangements associated with the evolution of primates
resulted in segmental duplications. If the fusions or fissions in
equid evolution produced similar complex features, then discovery of
these features may suggest which chain of events occurred.
Abstract. Przewalski’s wild horse (E. przewalskii, EPR) has a diploid
chromosome number of 2n = 66 while the domestic horse (E. caballus,
ECA) has a diploid chromosome number of 2n = 64. Discussions about
their phylogenetic relationship and taxonomic classification have
hinged on comparisons of their skeletal morphology, protein and
mitochondrial DNA similarities, their ability to produce fertile
hybrid offspring, and on comparison of their chromosome morphology and
banding patterns. Previous studies of GTG-banded karyotypes suggested
that the chromosomes of both equids were homologous and the difference
in chromosome number was due to a Robertsonian event involving two
pairs of acrocentric chromosomes in EPR and one pair of metacentric
chromosomes in ECA (ECA5). To determine which EPR chromosomes were
homologous to ECA5 and to confirm the predicted chromosome homologies
based on GTG banding, we constructed a comparative gene map between
ECA and EPR by FISH mapping 46 domestic horse-derived BAC clones
containing genes previously mapped to ECA chromosomes. The results
indicated that all ECA and EPR chromosomes were homologous as
predicted by GTG banding, but provide new information in that the EPR
acrocentric chromosomes EPR23 and EPR24 were shown to be homologues of
the ECA metacentric chromosome ECA5.

Copyright©2003 S. Karger AG, Basel 


Przewalski’s wild horse (Equus przewalskii, EPR) is the only extant
wild horse and historically lived in an area that now comprises
sections of Mongolia, Kazakhstan, and the Xinjiang-Uygur Autonomous
Region of China (Ryder, 1993). All living Przewalski’s wild horses are
descendants of 13 individuals (Ryder, 1994) and are now found only in
captive settings such as zoos and where reintroduced into wildlife
preserves.

A close relationship between domestic horses (Equus caballus, ECA) and
EPR has been shown by many researchers. Skull measurements do not
distinguish between the two species (Eisenmann and Baylac, 2000),
while other skeletal features are distinct (Sasaki et al., 1999).
Protein polymorphism studies support the close relationship (Kaminski,
1979; Lowenstein and Ryder, 1985; Bowling and Ryder, 1987), as do
molecular DNA studies (Oakenfull and Clegg, 1998) and amino acid
sequences (Pirhonen et al., 2002). Studies of mitochondrial DNA and
12S ribosomal RNA gene sequences show little or no differences between
ECA and EPR (George and Ryder, 1986; Ishida et al., 1995; Oakenfull
and Ryder, 1998; Oakenfull et al., 2000; Jansen et al., 2002).
Additionally, domestic horse/Przewalski’s horse hybrids are viable and
can produce offspring (Short et al., 1974), while hybrids of domestic
horses with other equids are usually viable but almost always
infertile.

Analyses of chromosome number and morphology are of use in
characterizing and defining species. EPR has a diploid chromosome
number of 2n = 66, in contrast to 2n = 64 in ECA (Benirschke et al.,
1965; Benirschke and Malouf, 1967). Examination of the karyotypes of
EPR and ECA revealed that the difference in diploid chromosome number
could be explained by EPR containing two additional pairs of
acrocentric chromosomes and one fewer metacentric chromosome pair than
ECA, with a Robertsonian fusion suspected in ECA (Ryder et al., 1978).
Ryder suggested that the metacentric chromosome pair ECA5 was
homologous to two pairs of acrocentric chromosomes in EPR (Ryder et
al., 1978).

This study was initiated to a) specifically determine if ECA5
homologues were involved in the Robertsonian rearrangements associated
with the two equids, and b) further investigate homology between EPR
and ECA chromosomes by fluorescence in situ hybridization (FISH)
mapping. Large-insert equine probes have been successfully used to
identify horse chromosome homology with donkey chromosomes (Raudsepp
et al., 2001). Therefore, this approach was selected for comparative
mapping since the domestic horse and Przewalski’s wild horse are
closely related.


Materials and methods 

Chromosome preparations 

Metaphase chromosome spreads were prepared by the CRES laboratory
at the Zoological Society of San Diego. Fibroblast cell lines of EPR accession
numbers KB7413 and KB12925, from the Frozen Zoo®, were used to prepare
metaphase spreads as previously described (Kumamoto et al., 1996).
Briefly, cells were harvested after exposure to colcemid (final concentration
0.025 µg/ml) for 105 min, and subsequently exposed to 0.067 M KCl for
30 min prior to fixation in methanol:acetic acid.

Probes 

DNA was prepared from horse bacterial artificial chromosome (BAC)
clones, obtained from Institut National de la Recherche Agronomique
(INRA) (Godard et al., 1998; Milenkovic et al., 2002) and the USDA
CHORI-241 Equine BAC library (http://www.chori.org/bacpac/equine241.htm).
Forty-six domestic horse-derived BAC clones, previously mapped to ECA,
were selected from 38 of 44 autosomal chromosome arms in the ECA
karyotype plus ECAX (Table 1). Of the total loci mapped, 44 were specific equine
genes, one contained equine DNA in the form of an anonymous BAC, and
one was an expressed sequence tag (EST).

FISH mapping and analysis 

DNA labeling and FISH were performed as previously described (Lear et
al., 2001).


Results 

All 46 horse BACs hybridized to EPR chromosomes. The 46 BACs included
at least one probe from 38 of the 44 ECA autosomal chromosome arms and
both arms of ECAX. A summary of BAC localizations in ECA, EPR and
human genomes can be found in Table 1.

Horse BAC clones containing the genes DIA1 (ECA5q17), LAMC2
(ECA5p17-p16), LAMB3 (ECA5p15), UOX (ECA5q15-q16), VCAM1 (ECA5q14),
and VDUP1 (ECA5p12) were FISH mapped to Przewalski’s horse chromosomes
(Fig. 1c). BAC probes containing genes from ECA5p and ECA5q hybridized
to two separate Przewalski’s horse acrocentric chromosome pairs, EPR23
and EPR24, respectively. For example, VDUP1 and VCAM1 identified two
separate acrocentric chromosome pairs (Fig. 1a). The identification of
the ECA5 homologues as EPR23 and EPR24 was based on GTG-banding
patterns (Fig. 1b). No other rearrangements were found. With the
exception of the differences involving ECA5 and its homologues EPR23
and EPR24, the distribution and order of the genes used in this study
appeared to be the same for both species. Each ECA chromosome has one
EPR homologue, with the exception of ECA5, which was shown to have two
homologues, as described above.
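The arm-by-arm assignment reported above can be checked mechanically from the six ECA5 markers listed in Table 1. The following Python sketch is an illustration only and is not part of the published analysis; the helper names `eca_arm` and `epr_by_arm` are hypothetical. It groups each marker by the ECA5 arm given in its band designation and collects the EPR chromosomes to which that arm's markers hybridized.

```python
from collections import defaultdict

# ECA5 markers as listed in Table 1: (gene, ECA band, EPR chromosome).
ECA5_MARKERS = [
    ("VDUP1", "5p12", "23"),
    ("LAMB3", "5p15", "23"),
    ("LAMC2", "5p17-p16", "23"),
    ("VCAM1", "5q14", "24"),
    ("UOX", "5q15-q16", "24"),
    ("DIA1", "5q17", "24"),
]


def eca_arm(band):
    """Return 'p' or 'q' for a band label such as '5p17-p16' (hypothetical helper)."""
    for ch in band:
        if ch in "pq":
            return ch
    raise ValueError(f"no arm designation in band {band!r}")


def epr_by_arm(markers):
    """Map each ECA arm to the set of EPR chromosomes its markers hybridized to."""
    groups = defaultdict(set)
    for _gene, band, epr in markers:
        groups[eca_arm(band)].add(epr)
    return dict(groups)


if __name__ == "__main__":
    for arm, eprs in sorted(epr_by_arm(ECA5_MARKERS).items()):
        print(f"ECA5{arm} markers map to EPR chromosome(s): {sorted(eprs)}")
```

With the Table 1 data, every ECA5p marker falls on EPR23 and every ECA5q marker on EPR24, which is the pattern expected if the two arms of ECA5 correspond to two separate acrocentric chromosomes in EPR.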

Discussion 

Based on mitochondrial DNA sequence diversity, domestic horses and
Przewalski’s wild horses are thought to have diverged from a common
ancestor within the past 500,000 to 1 million years (Ishida et al.,
1995; Oakenfull et al., 2000, respectively). Indeed, the karyotypes of
these two species appear very similar and the hypothesis was advanced
that they differ only by a single Robertsonian translocation appearing
as a metacentric chromosome in ECA and two small acrocentric
chromosomes in EPR (Ryder et al., 1978). Here we demonstrate that the
genetic material from the metacentric ECA5 is located on two
acrocentric chromosome pairs in EPR, EPR23 and EPR24. While a single
marker does not prove homology between entire chromosome arms, this
interpretation is consistent with chromosome banding patterns, size
and morphology of the chromosomes involved.

These data do not distinguish between a fusion of ancestral
acrocentric chromosomes to form ECA5 and a fission of the ancestral
ECA5 homologue to create EPR23 and EPR24. All the genomic material on
ECA5 is derived from HSA1 homologous DNA. Proposed ancestral mammalian
karyotypes suggest that the majority of HSA1 homologous genetic
material was originally found on one ancestral mammalian chromosome
(Murphy et al., 2001; Yang et al., 2003). Consequently, while fusion
or fission may equally explain the differences between these two horse
karyotypes, the most parsimonious explanation for this phenomenon
favors the fission of an ancestral equid chromosome containing HSA1
homologous genomic material to yield two acrocentrics ancestral to
EPR23 and EPR24.

However, parsimony does not constitute proof, and to resolve this
question more comparative gene mapping needs to be conducted. The
argument of parsimony assumes that fusion of chromosomes occurs at
random and that random chance does not favor the same fusions of
homologous acrocentric chromosomes in multiple species. The situation
for equids with regard to HSA1 homologous DNA is complicated by two
observations. First, at least three horse chromosomes, ECA2, ECA5, and
ECA30, show homology to HSA1 genes (Raudsepp et al., 1996). Second,
the gene order on ECA5 indicates multiple rearrangements relative to
the human gene order (Milenkovic et al., 2002). Indeed, neither
configuration may represent an ancestral phenotype and both
configurations may be derivative through multiple chromosome
rearrangements.

This study did not identify any other exceptions to chromosome
homology between Przewalski’s horse and domestic horse. The results
are consistent with the hypothesis that a very close phylogenetic
relationship exists between the two species.
Fig. 1. (a) BAC clones containing VDUP1 (ECA5p12) and VCAM1 (ECA5q14) hybridized to E. przewalskii chromosomes.
VDUP1 (EPR23) was visualized with FITC, and VCAM1 (EPR24) was visualized with Rhodamine Red-X. Chromosomes were
counterstained with DAPI. (b) EPR23 and EPR24 are the ECA5 homologues. EPR23 and EPR24 are arranged next to ECA5,
illustrating the similarities in the GTG-banding patterns. (c) Schematic presentation of ECA5 marker locations on ECA5, EPR23,
and EPR24.


Table 1. List of FISH mapped markers with their chromosome location in EPR, ECA, and human (Homo sapiens; HSA).
Equine map locations with references represent previously published mapping data. Human map locations for corresponding genes
were retrieved from (http://www.ncbi.nlm.nih.gov). A question mark (?) indicates map position unknown.


Symbol | Locus name | EPR | ECA | HSA
--- | --- | --- | --- | ---
A4 | Anonymous BAC | 1p | 1p (Lear, unpublished data) | ?
FES | v-fes feline sarcoma viral oncogene homolog | 1q | 1q (Lear et al., 2000) | 15q26.1
PKM | Pyruvate kinase muscle type 2 (PKM2) | 1q | 1q21 (Lear et al., 2000) | 15q22
ALPL | Alkaline phosphatase, liver/bone/kidney | 2p | 2p14 (Mariat et al., 2001) | 1p36.1-p34
SMARCA5 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 5 | 2q | 2q21 (Lear et al., 2001) | 4q31.1-q31.2
GLG1 | Golgi apparatus protein 1 | 3p | 3p13-p12 (Lear et al., 2001) | 16q22-q23
UCHL1 | Ubiquitin carboxyl-terminal esterase L1 | 3q | 3q22 (Lear et al., 2001) | 4p14
TCRG | T cell receptor gamma | 4p | 4p15-p14 (Lear et al., 2001) | 7p15-p14
EN2 | Engrailed homolog 2 | 4q | 4q27 (Lear et al., 2001) | 7q36
VDUP1 | Vitamin D up-regulated protein 1 | 23 | 5p12 (Lear et al., 2001) | 1
LAMB3 | Laminin, beta 3 (nicein, kalinin) | 23 | 5p15 (Mariat et al., 2001) | 1q32
LAMC2 | Laminin gamma 2 chain | 23 | 5p17-p16 (Mariat et al., 2001) | 1q25-q31
VCAM1 | Vascular cell adhesion molecule 1 | 24 | 5q14 (Lear et al., 2001) | 1p32-p31
UOX | Urate oxidase | 24 | 5q15-q16 (Godard et al., 2000) | 1p22
DIA1 | Diaphorase | 24 | 5q17 (Mariat et al., 2001) | 22q13.2-q13.31
INHA | Inhibin, alpha subunit | 5p | 6p14 (Mariat et al., 2001) | 2q33-q36
KRAS2 | v-Ki-ras2 Kirsten rat sarcoma 2 viral oncogene homolog | 5q | 6q21 (Lear, unpublished data) | 12p12.1
LDHA | Lactate dehydrogenase A | 8p | 7p14.1-p13 (Milenkovic et al., 2002) | 11p15.4
LYVE-1 | Lymphatic vessel endothelial hyaluronan receptor 1 | 8q | 7q16-q18 (Chowdhary et al., 2003) | 11
SART3 | Squamous cell carcinoma antigen recognized by T cells 3 | 6p | 8p16-p15 (Lear et al., 2001) | 12q24.1
TYMS | Thymidylate synthase | 6q | 8q12 (Lear et al., 2000) | 18p11.32
SLC7A10 | Solute carrier family 7, member 10 | 7p | 10p15 (Hanzawa et al., 2002) | 19q13.1
AMD1 | S-Adenosylmethionine decarboxylase 1 | 7q | 10q21 (Lear et al., 2001) | 6q21-q22
DDX5 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 | 10p | 11p13 (Lear et al., 2001) | 17q23-q25
GH | Growth hormone | 10p | 11p13 (Lear, unpublished data) | 17q22-q24
CHRM1 | Acetylcholine receptor, muscarinic 1 | 11q | 12q14 (Milenkovic et al., 2002) | 11q13
POR | P-450 (cytochrome) oxidoreductase | 12p | 13p13 (Milenkovic et al., 2002) | 7q11.2
PRM1 | Protamine 1 | 12q | 13q14-q16 (Lindgren et al., 2001) | 16p13.2
LOX | Lysyl oxidase | 13 | 14q22 (Lear et al., 2001) | 5q23-q31
Septin 2-like | Septin 2-like cell division control protein | 14 | 15q12 (Lear, unpublished data) | ?
GLB1 | Galactosidase, beta-1 | 15 | 16q22 (Lear, unpublished data) | 3p21.33
ALOX5AP | Arachidonate 5-lipoxygenase-activating protein | 16 | 17q14-q15 (Mariat et al., 2001) | 3q12
CHRNA | Cholinergic receptor, nicotinic, alpha | 17 | 18q24-q25 (Lear, unpublished data) | 2q24-q32
PROS1 | Protein S (alpha) | 19 | 19q21 (Milenkovic et al., 2002) | 3p11-q11.2
MUT | Methylmalonyl CoA mutase | 18 | 20q21 (Lear et al., 2001) | 6p21
GZMA | Granzyme A (granzyme 1, cytotoxic T-lymphocyte-associated serine esterase 3) | 20 | 21q13-q14 (Chowdhary et al., 2003) | 5q11-q12
RPN2 | Ribophorin II | 21 | 22q17 (Chowdhary et al., 2003) | 5q11-q12
IFNB1 | Interferon, beta 1, fibroblast | 22 | 23q16-q17 (Lear et al., 2001) | 9p21
GGTA1 | Glycoprotein, alpha-galactosyltransferase 1 | 26 | 25q17-q18 (Milenkovic et al., 2002) | 9q33-q34
SOD1 | Superoxide dismutase 1 | 27 | 26q15 (Godard et al., 2000) | 21q22.1
KITLG | KIT ligand | 29 | 28q13 (Terry et al., 2002) | 3p14.1-p12.3
HESTG05 | EST | 30 | 29qter (Godard et al., 2000) | ?
TGFB2 | Transforming growth factor, beta 2 | 31 | 30q14 (Milenkovic et al., 2002), 6q21 (Lear, unpublished data) | 1q41
PLG | Plasminogen | 32 | 31q12-q14 (Lear et al., 2000) | 6q26
TRAP170 | Thyroid hormone receptor associated protein complex component | Xp | Xp15-p14 (Raudsepp et al., 2002) | Xp11.4-p11.2
PGK | Phosphoglycerate kinase 1 (PGK1) | Xq | Xq13-q14 (Milenkovic et al., 2002) | Xq13.3
However, the resolution of FISH mapping a single marker to each
chromosome arm will not necessarily lead to the identification of
intrachromosomal inversions or small translocations. It is possible
that other rearrangements exist that would identify differences in
genome organization. Rearrangements not detected by our low density
comparative map might be observed by increasing the density of
domestic horse markers on the Przewalski horse chromosomes. Studying
the synaptonemal complexes of ECA/EPR hybrids might identify putative
chromosomal inversions, following the approach of Switonski and
Stranzinger (1998). However, these species are closely related and it
is possible that no inversions exist. Consequently, characterization
of these two horses as different species may revolve about the
differences in repetitive elements found between the two types of
horses (Wichman et al., 1991).
Genetic variation in Przewalski's horses, with
special focus on the last wild caught mare,
231 Orlitza III
Abstract. In our continuing efforts to document genetic diversity in
Przewalski’s horses and relatedness with domestic horses, we report
genetic variation at 22 loci of blood group and protein polymorphisms
and 29 loci of DNA (microsatellite) polymorphisms. The loci have been
assigned by linkage or synteny mapping to 20 autosomes and the X
chromosome of the domestic horse (plus four loci unassigned to a
chromosome). With cumulative data from tests of 568 Przewalski’s
horses using blood, hair or tooth samples, no species-defining markers
were identified; however, a few markers were present in the wild
species but not in domestic horses. Inheritance patterns and linkage
relationships reported in domestic horses appeared to be conserved in
Przewalski’s horses. A derived type for the last wild caught mare 231
Orlitza III provided evidence for markers apparently not found in (or
not currently available by descent from) the other species founders
that were captured at the end of the nineteenth century. This
information has been critical to the development of parentage analyses
in the studbook population of Przewalski’s horses at Askania Nova, at
one time the largest herd of captive animals and the source of stock
for reintroduction efforts. Some horses in the study showed genetic
incompatibilities with their sire or dam, contradicting published
studbook information. In many cases alternative parentage could be
assigned from living animals. To assist in identification of correct
parentage, DNA marker types for deceased horses were established from
archived materials (teeth) or derived from offspring. Genetic markers
were present in pedigreed animals whose origin could not be accounted
for from founders. Genetic distance analysis of erythrocyte protein,
electrophoretic and microsatellite markers in Przewalski’s horses and
ten breeds of domestic horse places the Przewalski’s horse as an
outgroup to domestic horses, introgression events from domestic horses
notwithstanding.

Copyright©2003 S. Karger AG, Basel 


Genetic studies of Przewalski’s horse (Equus ferus przewalskii) have
principally been motivated by two purposes: 1) documenting the
differences between Przewalski’s horse (PH) and the closely related
and interfertile species, the domestic horse (E. caballus) (DH) and 2)
making breeding management decisions for PHs, extinct in the wild and
increasingly faced with inbreeding concerns in captive populations.
Previous extensive genetic studies with blood groups and protein
polymorphisms showed a considerable amount of variation within the
species (e.g., Scott, 1979; Ryder et al., 1979, 1981, 1984; Putt and
Whitehouse, 1983; Bowling and Ryder, 1987; Bowling and Dileanis, 1990;
Patterson et al., 1990; Bowling, 1992; Bowling et al., 1992). A
limited microsatellite survey provided additional evidence of
polymorphism (Breen et al., 1994). Most genetic markers were shared
with domestic horses, but markers apparently unique to PH were
present. Recently, PH mitochondrial DNA (mtDNA) haplotypes have been
documented through D-loop sequencing (Oakenfull and Ryder, 1998).
Among representatives of the four extant female lines, only two
haplotypes were present, one in three lines and the other in the
remaining line, neither reported in DHs.

Recorded pedigrees (e.g., Volf and Kus, 1991; Kus, 1997) help provide
information for breeding management decisions, but inbreeding can be
minimized only if the pedigrees are correct. Animal identities can be
switched as a consequence of the similarity of appearance between
animals and inadequate use of permanent, reliable animal marking
methods. The availability of genetic marker profiles, originally
recorded for purposes of phylogeny studies, reestablished appropriate
identities of switched animals (e.g., Bowling and Ryder, 1988).

In this report, we focus on deriving the genetic type for 231 Orlitza
III, the last wild caught mare (captured in Mongolia in 1947) whose
offspring were bred at Askania Nova (AN) in Ukraine. For breeding
management of a species whose pedigrees otherwise include only 11
animals captured from the wild in 1901-1903, Orlitza’s genome
potentially provides significant contributions of genes and
combinations critical to species survival. We also present an
extensive genetic marker survey incorporating erythrocyte antigens,
protein polymorphisms and microsatellite data from 152 PHs bred or
used at AN. The data include 51 loci, covering at least 20 autosomes
plus the X of the DH (and four loci for which the autosomal assignment
is unknown). An additional 400 PHs from zoos throughout Europe and
North America, although not as comprehensively profiled as those of
the AN group, are combined with information from AN animals for a
comprehensive marker assessment of the phylogenetic distance
relationship between PH and breeds of DH.

Materials and methods 

Samples 

Blood, teeth and hair were used as sources of antigens, proteins and DNA
for blood group, protein and microsatellite variation assays. Tissue cultures
established from skin biopsies and blood were used as a source of DNA for
mtDNA studies. Blood samples were collected from 145 living horses bred or
used at AN. Teeth from another seven animals provided from museum
collections at AN and Kiev were used as a DNA source to assist in ascertaining
and deriving AN founder types. Genetic profiles, including data on
erythrocyte antigens, protein variants detected by electrophoresis and/or
microsatellite markers were also obtained from 416 animals from zoos throughout
North America and Europe. Animals in this report are identified according
to their studbook identity (Volf and Kus, 1991) at the time of sampling. DH
data used for comparison were from the database information of the
Veterinary Genetics Laboratory, which included tests of over one million horses from
at least 50 breeds, although not all samples or breeds were tested for every
system described here.

Loci analyzed 

Using standard protocols for serology, protein electrophoresis and
fragment length analysis of fluorescent-tagged primer PCR-amplification of
DNA, seven loci of blood groups, 15 loci of protein polymorphisms and 29
loci of microsatellites were analyzed for genetic variation (references
provided in Juneja et al., 1984, 1989; Bowling and Clark, 1985; Bowling and
Dileanis, 1990; Bowling et al., 1990, 1992, 1997). mtDNA sequence was
determined according to the protocol of Oakenfull and Ryder (1998). The
locus abbreviations and allelic nomenclature provided in the tables represent
a consensus standard nomenclature used for DH by member laboratories of
the International Society for Animal Genetics (ISAG). Conclusions of parentage
exclusion followed robust implementation of laboratory procedures and
included extensive retesting to verify results. Due to the large number of
highly polymorphic loci tested for each animal from the AN program (usually
51 loci, except for animals tested from tooth samples for which only
microsatellite profiles were obtained), a qualification of parentage was tantamount to
proof of parentage (Bowling et al., 1997). Chromosome assignments for
genes and microsatellites are from linkage maps assembled for DH by
Lindgren et al. (1998) and Guerin et al. (2003) and from synteny mapping by
Shiue et al. (1999). While PH has two more chromosomes than DH (2n = 66
vs 64), the difference involves a Robertsonian translocation, in which
chromosome 5 of DH (ECA5) (a metacentric) is present in PH as two acrocentric
chromosomes (Ryder et al., 1985; Yang et al., 2003; Myka et al., 2003).

Deriving types 

Although a blood sample was not available from 259 Pegas, his blood and 
microsatellite DNA type could be derived using available offspring and their 
dams. On first analysis, Pegas’ putative offspring provided evidence for a 
profile that would have more than two alleles at several loci, but subsequently 
a subset of offspring was identified that provided a consistent genotype with 
no more than two alleles per locus. The derived DNA profile matched that 
later obtained using an archived tooth from Pegas. No archived material was 
available from Orlitza III, but her type could be derived using genetic profiles 
obtained directly from her son 285 Bars (blood), directly and by derivation 
from her son Pegas, from 146 Robert Orlik (tooth) and from her grandson 
313 Vizor (tooth). 
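The derivation logic described above, in which a missing parent's type is inferred from offspring and their dams, can be sketched in a few lines of Python. This is an illustrative simplification for a single autosomal locus, not the laboratory's actual procedure; the function names and the allele labels in the usage note are hypothetical. For each offspring, the allele that cannot be attributed to the dam must have come from the sire, and a putative sire is consistent only if no more than two such alleles accumulate across his offspring.

```python
def derive_sire_alleles(pairs):
    """Collect alleles a shared sire must have transmitted at one autosomal locus.

    `pairs` is a list of (offspring_genotype, dam_genotype) tuples, each genotype
    being a 2-tuple of allele labels. Offspring whose paternal allele cannot be
    determined unambiguously are skipped as uninformative.
    """
    sire = set()
    for (a1, a2), dam in pairs:
        candidates = set()
        if a1 in dam:           # a1 may be maternal, so a2 would be paternal
            candidates.add(a2)
        if a2 in dam:           # a2 may be maternal, so a1 would be paternal
            candidates.add(a1)
        if not candidates:
            raise ValueError(f"offspring {(a1, a2)} is incompatible with dam {dam}")
        if len(candidates) == 1:
            sire |= candidates
    return sire


def consistent_with_single_sire(pairs):
    """A diploid sire can transmit at most two distinct alleles per autosomal locus."""
    return len(derive_sire_alleles(pairs)) <= 2
```

As a usage example with made-up allele labels, offspring ("M", "N") from dam ("M", "M") and offspring ("O", "P") from dam ("P", "P") imply a sire carrying N and O. Adding a third offspring ("S", "M") from dam ("M", "M") would require a third paternal allele, S, which is the kind of signal that led to the identification of a consistent subset of Pegas' putative offspring.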

Genetic distance 

The genetic distance values for pairwise considerations of ten DH breeds
and PH, based on 38 loci, were calculated using DISPAN (Ota, 1993). These
included seven blood group loci (A, C, D, K, P, Q and U); 15 protein
polymorphisms (at the AP, CA, CAT, HBA, PGD, PGM, GPI, ALB, C3, ES, GC,
PLG, TF, PI and XK loci); and 16 microsatellite loci (ASB17, VHL20,
HTG10, HTG7, HTG4, AHT5, AHT4, HMS6, HMS3, HMS7, HMS1,
LEX3, LEX33, ASB2, UCDEQ425 and HTG). Thirteen additional
microsatellites were tested for the PH diversity study, but only limited DH data were
available for these loci when the studies were undertaken, so they are not
included in the genetic distance calculations. Ten breeds representative of
draft horses, light (riding) horses, racehorses and ponies were used for
calculating allele frequencies: Percheron (PN), Arabian (AR), Paso Fino (PF),
Iberian (IB), Lipizzaner (LI), Morgan Horse (MH), Trakehner (TK),
Thoroughbred (TB), Norwegian Fjord (NF) and Miniature (MI). The number of
DH and PH analyzed for each locus are listed in the Supplemental Table
(available from the corresponding author or at www.karger.com/doi/
10.1159/000075754). Each of these breeds consists of a larger population
than the PHs, and has a larger number of founders. Probably the LI data in
this study provide the closest comparison to PH in terms of population
parameters (founders, current size, number of animals tested). As a means of
visualizing these distance data, a dendrogram was constructed based on a
neighbor joining algorithm (NEIGHBOR) using PHYLIP (Felsenstein,
1993).
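The text does not state which distance measure was selected in DISPAN. As an illustration only, the following Python sketch computes Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)), from per-locus allele frequencies; the choice of this measure, the function name and the data layout are assumptions made for the example and do not reproduce the published calculation.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    Each population is a dict mapping locus name -> {allele: frequency}.
    Only loci present in both populations are used. Jx and Jy are the mean
    per-locus homozygosities; Jxy is the mean per-locus allele identity
    between the two populations.
    """
    loci = pop_x.keys() & pop_y.keys()
    if not loci:
        raise ValueError("populations share no loci")
    jx = jy = jxy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        jx += sum(p * p for p in fx.values())
        jy += sum(q * q for q in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    n = len(loci)
    identity = (jxy / n) / math.sqrt((jx / n) * (jy / n))
    # The distance is undefined (math.log raises) if the populations share no alleles.
    return -math.log(identity)
```

A matrix of such pairwise values, one per pair among the ten breeds and PH, is the kind of input a neighbor-joining program takes to build a dendrogram.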

Results 

Genetic markers of PH and variant comparisons with DH 

Blood groups: In tests that detect 52 alleles across ten breeds of DH,
19 alleles were found in PH, none unique to PH (Table 1). PH is
particularly restricted at EAD with three alleles compared with 25 in
DH. The only fixed locus was EAK - no PHs were positive for the single
antigen (Ka) identified for this locus.

Protein polymorphisms: In tests that detect 100 alleles across ten
breeds of DH, 43 variants were found in PH, seven (at five loci)
unique to PH (ES-P, GPI-T, XK-P, TF-d, PI-Prz, -Pzl, -Pzk) (Table 2).
These variants have been described previously (Scott, 1979; Bowling
and Ryder, 1987; Patterson et al., 1990). The only fixed loci (PGD,
GC) were those with low allelic diversity in DH.

Microsatellites: In tests that detect at least 297 alleles across ten
breeds of DH, 137 alleles were found in PH, eight (at seven loci)
unique to PH (HMS15-223, ASB17-Y, HMS3-T, HMS2-S, -U, EB2E8-T,
UCDEQ502-T and LEX22-111) (Table 3). Due to the limited information
available across breeds for some of the microsatellite loci, the
number of alleles given for DHs might be low. The identification of
alleles not found to be represented in DH at HMS15, HMS2, EB2E8,
UCDEQ502 and LEX22 might prove incorrect, but the data are not thought
to provide a significant misrepresentation.


Table 1. Blood groups in Przewalski’s horses: origins by founder among
Askania Nova horses

ECA | Locus | No. of alleles, DH | No. of alleles, PH | AN source: RO plus O (derived); Hosana (derived); Am; Vjuga
--- | --- | --- | --- | ---
2 | EAK | 2 | 1 | -/
8 | EAQ | 6 | 4 | b c /-; ac
(14) | EAD | 25 | 3 | adn eg d
20 | EAA | 12 | 5 | bee /-; be; adf; adg
24 | EAU | 2 | 2 | a/-
U | EAC | 2 | 2 | a/-
U | EAP | 3 | 2 | /-; acd
Total | | 52 | 19 |

Equus caballus chromosome assignment, provisional in brackets.
DH = domestic horse; PH = Przewalski's horse.
RO = Robert Orlik, O = Orlitza.
Only partial type derived.
This blood group found in American horses (Sigor, Sibol, Lisa, Boleta) at AN,
not in other sources.
This blood group found in Vjuga and descendants, not in other sources at AN.


Comparison of variants: Most of the genetic variants in PHs are shared
with DHs (184/199 alleles, 92%). No species-defining loci were
identified. Nonetheless, profiles of individual animals are usually
distinctive from DHs, especially by blood typing, due to the high
frequency of unique PH blood protein variants at TF, ES and PI (data
not shown). Calculations of average heterozygosity based on allelic
frequencies (Table 4) suggest that heterozygosity is within the
estimated range for all the DH breeds, except the breed with the
highest heterozygosity (MI). These heterozygosity comparisons are also
provided in Bowling and Ruvinsky (2000).
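The shared-allele fraction quoted above follows directly from the counts given in the Results, and average heterozygosity can be computed from allele frequencies in the usual way. The Python sketch below is illustrative only: it uses the standard expected heterozygosity, H = 1 - sum(p_i^2), averaged over loci, which is an assumption because the exact estimator behind Table 4 is not stated in the text, and the function names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus from {allele: frequency}."""
    return 1.0 - sum(p * p for p in freqs.values())


def average_heterozygosity(loci):
    """Mean expected heterozygosity over a dict of locus -> {allele: frequency}."""
    if not loci:
        raise ValueError("no loci supplied")
    return sum(expected_heterozygosity(f) for f in loci.values()) / len(loci)


# Allele counts reported in the Results for PH, by marker class.
ALLELES_FOUND = {"blood groups": 19, "protein polymorphisms": 43, "microsatellites": 137}
ALLELES_UNIQUE_TO_PH = {"blood groups": 0, "protein polymorphisms": 7, "microsatellites": 8}

total_alleles = sum(ALLELES_FOUND.values())
shared_alleles = total_alleles - sum(ALLELES_UNIQUE_TO_PH.values())
shared_percent = 100.0 * shared_alleles / total_alleles

if __name__ == "__main__":
    print(f"{shared_alleles}/{total_alleles} alleles shared ({shared_percent:.0f}%)")
```

The counts reproduce the 184/199 figure in the text. A fixed locus such as EAK in PH contributes zero to the heterozygosity average, since a single allele at frequency 1 gives H = 0.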

Deriving the genetic profile for Orlitza 

The derived type for Orlitza’s DNA (microsatellite) markers is
presented in Table 3. Lacking direct blood typing information for
Robert Orlik, the sire of her offspring, it was difficult to derive
Orlitza’s markers in blood group and protein systems. However, from
consideration of genotypes of PHs worldwide, it is clear that Orlitza
provided two new alleles for PH, TF-D and CA-E, both found in DH.
Considering the DNA markers, she contributed two other new variants to
PHs, HMS3-T and EB2E8-T, neither one shared with DH. For the
microsatellite loci, she was heterozygous at 14 of 25 autosomal loci
(56%), a minimum estimate since it is based only on transmission data
to three offspring.

Linkage relationships among loci 

Several of the tested loci are known to be syntenic in DH,
specifically loci on ECA1, 4, 10, 15, 16, 24 and X (see chromosome
assignments given in Tables 1-3). Among those known to
be linked in DH, the same relationships appeared to be true for 

Table 2. Protein polymorphisms in Przewalski’s horses: origins by founder among Askania Nova horses

ECA (a) | Locus | No. of alleles (b), DH | No. of alleles (b), PH (u) | Alleles by source: AN derived types (c), (RO + O) and (H); AN, American founders (d), RO dam (e) and Vjuga; Elsewhere, not in AN (Introgression at AN?)
--- | --- | --- | --- | ---
02 | PGD | 3 | 1 | F
03 | GC | 2 | 1 | F
03 | ALB | 3 | 2 | B; A
03 | ES | 12 | 4 (1) | P (f); H; I # 8
05 | PGM | 3 | 2 | F; S; S
07 | C3 | 5 | 3 | 2; 3; 4; 1
(07) | CAT | 3 | 3 | F; S; X
(07) | APOA4 | 4 | 2 | F; S
09 | CA | 6 | 2 | E; I
10 | GPI | 5 | 2 (1) | I; T
10 | XK | 4 | 3 (1) | F; K; P
13 | HB | 5 | 2 | I; II
16 | TF | 15 | 7 (1) | d; D; F3; F2; E I J
24 | PI | 25 | 6 (3) | Prz; Pzl 1; L; SI; Pzk P; L2
U | SP3 | 5 | 4 | FI F2; 1; S
Total | | 100 | 44 (7) |

a Equus caballus chromosome assignment, provisional in brackets.
b DH = domestic horse, PH = Przewalski's horse, u = unique to PH, not in DH.
c RO = Robert Orlik, O = Orlitza, H = Hosana.
d Additional variation at AN contributed by Sigor, Sibol, Lisa, Boleta.
e Dam of Viola, Vira, Vetla, Vorot.
f Crosshatching denotes PH allele.
g Null allele.
PH (Bowling, unpublished results). Linkage disequilibrium 
among linked loci on ECA4, 10, 15, 24 and X allowed the added 
power of haplotyping to point to parentage solutions. 

Pedigree analysis 

Initially, to derive the genetic profile of Orlitza III we used 
only blood samples obtained from animals at AN in 1991 and 
1992 and from AN-bred horses elsewhere (285 Bars, 601 Vira, 
605 Vorot, 606 Vulkan, 831 Varna, 826 Vata, 832 Plunzer, 
1118 Garant and 1048 Moros). Analysis of blood group and 
protein polymorphisms and a limited set of DNA markers 
identified pedigree incompatibility problems that needed to be 
sorted out before the profiles could be derived. For example, 
the collection of horses assigned to be sired by Pegas could not 
all be offspring of a single stallion (data not shown). Likewise, 
offspring attributed to Volga, daughter of Orlitza III, could not 
all be the offspring of a single mare (data not shown). Additionally, 
among offspring that represented inbred combinations of 
Robert Orlik and Orlitza III with no other founder input, more 
alleles were present at some loci than could be provided by two 
horses (e.g., six alleles at ASB17, that is, two more than could be 
provided by two horses; four alleles at LEX3 (X-linked), one 
more than could be provided by a single mare and stallion) (Table 5). 
While the data clearly provided evidence to exclude parentage 
relationships, correct parentage could not be assigned 
without additional information. For this task we increased the 
number of loci tested and tested samples (teeth) from AN 
genetic founder animals. We had biological samples (blood or 
teeth) from all AN founders except Orlitza III (samples from 
Robert Orlik, 283 Hosana, 396 Vada, 533 Sigor, 1128 Sibol, 
812 Boleta and 846 Lisa), although in the last analysis we were 
unable to obtain results from Vada’s tooth sample. We also 
obtained a profile from a tooth from 295 Sixtus, an infertile 
stallion imported to AN from Germany with no studbook-assigned 
foals. We had no material from another stallion at AN 
with no assigned foals (79 Tornado/Vasik), but with a similar 
pedigree to Hosana and Vada. 


Table 3. Microsatellite polymorphisms in Przewalski’s horses: origins by founder among Askania Nova horses 


ECA a 
Locus 
No. of alleles b: DH; PH (u) 
Allelic variation in Przewalski's horses at AN and elsewhere (At AN; At AN, not elsewhere; Outside AN (Introgression?)): Robert Orlik; (Orlitza) c; Hosana; American horses; Additional variants d; Unkn RO dam e; Vjuga; Vika, Sosna, Moros, Potok 


ECA   Locus      DH    PH (u) 
01    HMS7       10    5 
01    HMS15      15    8 (1) 
01    UCDEQ487   9     6 
02    ASB17      22    8 (1) 
04    HTG7       5     4 
04    LEX33      12    5 
04    HMS6       8     3 
08    AHT5       11    4 
08    UCDEQ46    5     2 
08    LEX23      11    4 
09    HTG4       8     5 
09    HMS3       11    6 (1) 
10    ASB9       9     5 
10    HMS2       9     6 (2) 

16  21  24  26  X 

L 

229 

P 

H 

M 

P 

L 

K 

M 

250 

N 

P 

O 

J 


Q 

231 

# 

S 

N 

S 

N 

N 


Q 

R 

K 


(O) 

(217) 

(M) 

(D) 

(K) 

(R) 

(L) 

(K) 

(L) 
(236) 
(K) 

(M) 
(M) 
(M) 


f 


(#) 

(O) 

(T) 

(P) 

(S) 

(O) 

(246) 



N 

P 

N 

K 

nt h 

250 



N 


M 


I 


M 


O 

Q 


(T) 
(O) 
(N) 


M 

1 

N 

S 


P 

u 


240 


226 


L 


L 


Locus      DH    PH (u) 
UCDEQ505   10    4 
HTG10      12    7 
AHT4       11    5 
EB2E8      8     3 (1) 


M 

K 

H 

K 


N 

O 

N 

N 


(Q) 
(J) 

(N) 
(T) 


(J) 




O 


10 

UCDEQ412 

12 

3 

K 

P 

(P) 


K 

P 

R 




O 

15 

ASB15 

10 

4 

E 

N 

(E) 


I 



P 


M 

P 

15 

ASB2 

14 

5 

B 

M 

(K) 

(M) 

B 

M 

N 

R 

I 



15 

HTG6 

11 

3 

N 


(N) 


N 


R 

O 

J 

I 


15 

HMS1 

8 

3 

K 

M 

(M) 

(N) 

K 




J 




R 


M 


■i 


28 

UCDEQ425 

11 

6 

K 

L 

(J) 

K 

O 

F 

I 




I 

30 

VHL20 

10 

4 

M 

R 

(O) 

(R) 

p 







I 

X 

UCDEQ502 

9 

6(1) 

T 


(O) 


F 


N 

P 

s 


K 


X 

LEX27 

6 

3 

200 


(198) 


202 







194 

X 

LEX22 

6 

3(1) 

115 


(105) 


111 

115 




113 ■ 


LEX3 


14 


(L) 


(O) 


H 


M 


Total 297 136(8) 


a Equus caballus chromosome assignment. 
b DH = domestic horse, PH = Przewalski's horse, u = unique to PH, not in DH. 
c Derived using types of Robert Orlik, Pegas, Bars and Vizor. 
d Additional variation at AN contributed by Sigor, Sibol, Lisa, Boleta. 
e Dam of Viola, Vira, Vetla, Vorot. 
f Null allele. 
g Crosshatching denotes PH allele. 
h Not tested. 
Table 4. Estimated average heterozygosity at 38 polymorphic loci for ten breeds of horses and 
Przewalski’s horse, arranged from lowest to highest values. Sixteen erythrocyte antigen loci, seven 
protein polymorphisms and 15 microsatellite loci were analyzed (see Materials and methods). Ten 
individuals of each breed were used for estimating genetic diversity 

Breed or taxon a   Estimated average heterozygosity (±SD) 
TB                 0.461 (±0.047) 
LI                 0.473 (±0.042) 
PH                 0.474 (±0.044) 
AR                 0.478 (±0.045) 
IB                 0.491 (±0.046) 
TK                 0.511 (±0.043) 
NF                 0.531 (±0.039) 
PN                 0.535 (±0.041) 
MH                 0.537 (±0.042) 
PF                 0.551 (±0.045) 
MI                 0.579 (±0.038) 

a AR: Arabian; IB: Iberian; LI: Lipizzan; MI: Miniature Horse; MH: Morgan Horse; NF: 
Norwegian Fjord; PF: Paso Fino; PN: Percheron; PH: Przewalski's horse; TB: Thoroughbred; 
TK: Trakehner. 


Table 6. Examples of genetic analysis of parentage, considering the studbook pedigree and the likely 
true pedigree based on up to a total of 51 loci of blood type and DNA markers 

                 Year of   Studbook pedigree   Genetic analysis a       Genetic pedigree b 
Horse            birth     Sire     Dam        Mating   Sire   Dam      Sire     Dam 
490 Vetka        71        Vizor    Volga      EXC      EXC    nt       Pegas    Viola 
524 Viola        72        Vizor    Volga      EXC      EXC    nt       Gordyj   ?§ 
548 Vena         73        Pegas    Volga      EXC      EXC    nt       Gordyj   Vetla 
601 Vira         74        Pegas    Volga      EXC      Q      nt       Pegas    ?§ 
606 Vulkan       74        Pegas    Vetka      EXC      Q      EXC      Pegas    Hosana 
765 Vetla        78        Pegas    Vira       EXC      EXC    EXC      Vizor    ?§ 
766 Veska        78        Pegas    Vena       EXC      Q      EXC      Pegas    Vetla 
826 Vata         79        Pegas    Vena       EXC      EXC    EXC      Gordyj   ?* 
831 Varna        79        Pegas    Vetka      EXC      Q      EXC      Pegas    Viola 
832 Plunzer      79        Pegas    Vira       EXC      Q      EXC      Pegas    ? 
843 Vaflja       79        Pegas    Volga      EXC      Q      nt       Pegas    ? 
893 Volsebnik    80        Pegas    Vena       EXC      Q      EXC      Pegas    Vira 
970 Paris        81        Pegas    Volga      EXC      Q      nt       Pegas    ?* 

a nt = not tested; EXC = excluded as parent; Q = qualifies as parent. 
b ?§ = mare by Robert Orlik out of unknown dam; ?* = mare by Vizor out of Hosana (i.e., Golubka). 



Table 5. Selected loci of microsatellites for horses that by studbook trace 
only to Robert Orlik plus Orlitza III showing the presence of more alleles 
than can be accounted for by a single breeding pair 

                 Alleles present at microsatellite loci 
Horses           ASB17          HTG7        HTG4        LEX3 
490 Vetka        NT             N           KN          LM 
524 Viola        NY             MN          NP          IM 
548 Vena         SY             NO          KN          MO 
601 Vira         HS             MN          N           L 
606 Vulkan       TU             KN          MN          M 
765 Vetla        HS             MO          N           IO 
766 Veska        HS             KM          KN          IL 
826 Vata         HT             NP          MP          M 
831 Varna        TY             KM          KP          LM 
832 Plunzer      H              KM          KN          O 
843 Vaflja       H              NP          KQ          LO 
893 Volsebnik    ST             MN          KN          L 
970 Paris        T              KP          MN          M 
Alleles          H,N,S,T,U,Y    K,M,N,O,P   K,M,N,P,Q   I,L,M,O 
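
The allele tallies in the last row of Table 5 can be rechecked mechanically. The sketch below transcribes the genotypes from the table and assumes that each allele is written as a single letter and that a single letter on its own denotes a homozygote (or the single X of a male at LEX3). The limits of four alleles per autosomal locus and three per X-linked locus for one mare and one stallion follow the reasoning given in the text; the variable and function names are illustrative only.

```python
# Genotypes transcribed from Table 5, in row order (Vetka ... Paris).
TABLE5 = {
    "ASB17": ["NT", "NY", "SY", "HS", "TU", "HS", "HS", "HT", "TY", "H", "H", "ST", "T"],
    "HTG7": ["N", "MN", "NO", "MN", "KN", "MO", "KM", "NP", "KM", "KM", "NP", "MN", "KP"],
    "HTG4": ["KN", "NP", "KN", "N", "MN", "N", "KN", "MP", "KP", "KN", "KQ", "KN", "MN"],
    "LEX3": ["LM", "IM", "MO", "L", "M", "IO", "IL", "M", "LM", "O", "LO", "L", "M"],
}

# Most distinct alleles one mare plus one stallion can carry:
# 4 at an autosomal locus, 3 at an X-linked locus (the stallion has one X).
MAX_FROM_PAIR = {"ASB17": 4, "HTG7": 4, "HTG4": 4, "LEX3": 3}


def distinct_alleles(genotypes):
    """Sorted list of the distinct allele letters seen across all genotypes."""
    return sorted({allele for genotype in genotypes for allele in genotype})


def excess_alleles(locus):
    """How many more alleles were observed than a single pair could supply."""
    return len(distinct_alleles(TABLE5[locus])) - MAX_FROM_PAIR[locus]


for locus in TABLE5:
    print(locus, ",".join(distinct_alleles(TABLE5[locus])), "excess:", excess_alleles(locus))
```

Any positive excess means that at least one recorded pedigree in the group must be wrong, although the count alone does not say which one.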


From pedigree analysis of 139 animals, genetically compatible 
parentage matches were identified for 110. Another 29 
could not be confirmed since we did not have two qualifying 
parents. In a few cases, a studbook-listed parent was dead or not 
tested. In others the studbook-listed parentage was tested and 
excluded, but no qualifying parent could be found. Primarily 
the foals were sired by Pegas (28), as well as by his sons 391 Gordyj 
(3), 821 Parad (26) and 970 Paris (27). Due to the extensive 
genetic profiles obtained for all the horses, despite the similarity 
in pedigrees, only a single qualifying sire was identified for 
each offspring (all others could be excluded). No offspring were 
attributed to Sixtus by studbook record or through genetic testing. 
The pedigree corrections have been provided to the management 
staff at AN and to the studbook keeper. 
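
The exclusion logic behind the EXC and Q entries of Table 6 can be sketched as follows. This is a minimal illustration of Mendelian compatibility testing at autosomal codominant loci, not the authors' procedure: the genotypes are hypothetical, the function names are invented for the example, and null alleles, mutation and X-linked loci are ignored.

```python
def parent_qualifies(offspring, parent):
    """True if the candidate shares at least one allele with the offspring at every locus."""
    return all(set(offspring[locus]) & set(parent[locus]) for locus in offspring)


def mating_qualifies(offspring, sire, dam):
    """True if, at every locus, one offspring allele can come from each parent."""
    for locus, genotype in offspring.items():
        a, b = genotype[0], genotype[-1]  # a homozygote such as "M" gives a == b
        s, d = set(sire[locus]), set(dam[locus])
        if not ((a in s and b in d) or (a in d and b in s)):
            return False
    return True


# Hypothetical genotypes at two autosomal loci (not data from the study).
foal = {"locus1": "KN", "locus2": "M"}
sire = {"locus1": "KM", "locus2": "MO"}
dam = {"locus1": "KP", "locus2": "M"}

print(parent_qualifies(foal, sire))       # the sire alone is compatible
print(parent_qualifies(foal, dam))        # the dam alone is compatible
print(mating_qualifies(foal, sire, dam))  # the pair is not: neither parent carries N
```

The example mirrors rows of Table 6 in which a listed sire qualifies on his own while the listed mating is excluded: both candidates can supply allele K, but neither can supply N.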

Primarily the insoluble problems appeared to be in the pedigrees 
of animals foaled prior to the mid-1980’s. For example, 
among the 20 tested animals foaled from 1971 to 1980, none 
could be verified to have a correctly assigned pedigree. For the 
most part, by pedigree these animals represented crosses of 
descent from Robert Orlik and Orlitza III, but from genetic 
tests, the founder inputs were more complex and included 
Hosana as well as unidentified horses. Among several of the 
older horses (e.g., Viola, Vira, Vetla) for which a sire could be 
identified (see Table 6), a genetic profile could be established 
for a dam that could be a daughter of Robert Orlik, but that 
included alleles not found in the profile of Robert Orlik, 
in that of Hosana or in the derived type of Orlitza III. For example, see 
ASB17-N and HTG7-O among profiles in Table 5, not present 
in AN founder types provided in Table 3. Accounts of horses at 
AN suggest that PH/DH hybrids sired by Robert Orlik were 
present (Treus, 1962; Garrutt et al., 1966). A possible explanation 
for the presence of alleles at AN not present in AN PH 
founders was that one of these hybrid mares had been misidentified 
as a PH, probably as Volga. A critical test of this hypothesis 
was mtDNA analysis for horses tracing to Volga in matrilineal 
line. A mtDNA type had been determined for Bars 
(Oakenfull and Ryder, 1998), which should be shared by his sister 
Volga and her matrilineal descendants. In support of the hybrid 
hypothesis, control region mtDNA haplotypes of 490 Vetka, 
601 Vira and 831 Varna were identical but did not match that 
of Bars (Fig. 1). 

Another problem foundation pedigree was that of Vjuga (by 
pedigree: Vizor × Vada). Vizor was excluded as her sire and no 
alternative sire could be identified. Unfortunately the Vada 
tooth did not yield PCR-amplifiable DNA so we had no genetic 
profile for Vada. Vjuga had several distinctive variants, not 
[Fig. 1 diagram: partial pedigree with female founders SB#40, SB#12, SB#52 and Orlitza, and a haplotype table comparing PH-I, PH-II and the AN-probands at variable nucleotide positions of the mtDNA control region.] 


Fig. 1 . Partial pedigree of Askania Nova Przewalski’s horses. The mitochondrial haplotype of Orlitza III (231) was determined 
by evaluating her offspring, Bars (285). This haplotype corresponds to a known PH haplotype (PH-II). Reputed descendants of 
Volga (244) displayed a single haplotype not observed in Przewalski’s horses (denoted AN-probands in the haplotype diagram 
below the pedigree). Nucleotide positions in the haplotype diagram correspond to the numbering of the E. caballus complete 
mitochondrial sequence of Xu and Arnason (1994). 


present in other pedigrees at AN, and also not present in PHs 
outside of AN (for example, C3-1, HTG6-I, UCDEQ502-K; 
Tables 2 and 3). The pedigree elements of Vada and Hosana are 
common in PH pedigrees worldwide and we could anticipate 
that factors present in Vada but not Hosana would be found 
outside of AN. Thus, we also propose introgression from a second 
source into the Vjuga line, possibly through the sire, since 
no qualifying stallion was identified. 

Finally, as also presented in Tables 2 and 3, there was a third 
set of factors in the horses 898 Potok, 966 Vika, 1048 Moros 
and 1215 Sosna that could not be accounted for from AN founders. 
Fig. 2. Neighbor-Joining dendrogram of ten breeds of domestic horses 
and Przewalski’s horse. Genetic distances were calculated and dendrograms 
produced as described in Materials and methods. Numbers adjacent to nodes 
represent bootstrap percent values from 1,000 replications of resampled loci. 
Breed designations are abbreviated as in Table 4. 


Phylogeny 

As in other genetic distance studies, in our present effort PH 
provides the most dissimilar of the paired comparisons between 
DH breeds or between PH and DH breeds (Table 7). As a 
means of visualizing these distance data, a dendrogram was 
constructed based on a neighbor-joining algorithm (Fig. 2), 
likewise showing the outgroup position of the PHs compared 
with DHs. These data and the figure are also presented in Bowling 
and Ruvinsky (2000). 
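
How a pairwise distance of the kind shown in Table 7 is obtained from allele frequencies can be sketched briefly. The excerpt does not say which formula the "standard genetic distance" refers to; the example assumes Nei's standard genetic distance, D = -ln(J_XY / sqrt(J_X J_Y)), and uses hypothetical allele frequencies rather than data from the study. Sampling-bias corrections and the SD estimates reported in the table are left out.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance from per-locus allele frequency dicts.

    pop_x and pop_y map locus name -> {allele: frequency}; both populations
    are assumed to be typed at the same loci.
    """
    jx = jy = jxy = 0.0
    for locus in pop_x:
        fx, fy = pop_x[locus], pop_y[locus]
        alleles = set(fx) | set(fy)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
    # Sums over loci stand in for means; the number of loci cancels in the ratio.
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical frequencies at one locus (not from the study).
breed_a = {"L1": {"A": 0.5, "B": 0.5}}
breed_b = {"L1": {"A": 0.5, "C": 0.5}}

print(round(nei_standard_distance(breed_a, breed_a), 3))
print(round(nei_standard_distance(breed_a, breed_b), 3))
```

Two populations with identical frequencies are at distance zero, and the distance grows as shared alleles become rarer, which is the pattern the PH row of Table 7 shows relative to the comparisons among domestic breeds.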

Discussion 

Genetic diversity among PHs 

Genetic diversity is of special significance for endangered 
species such as PH. The potential for inbreeding problems is a 
constant concern due to the small founder numbers and the 
restricted breeding bases conveniently available for zoos. Inbreeding 
increases homozygosity and with it the possibility of pairing 
recessive deleterious or lethal alleles. Since the species is extinct 
in the wild, genetic variation cannot be augmented from that 
resource. Zoological societies have agreed to manage the genetic 
variation in the captive populations to minimize inbreeding 
and maintain approximately the present levels of variability 
(Ryder et al., 1984; Princee et al., 1990; Zimmermann, 1997). 
While the total number of alleles found in PHs for the loci analyzed 
in this study is slightly under half that of DHs (199 vs 
449), nonetheless the allelic frequency distributions are such 
that theoretically the animals can exhibit as much heterozygosity 
as within breeds of DH (see Table 4). While individual animals, 
especially the products of inbred pedigrees, may be relatively 
homozygous, management schemes whose goals are to 
minimize inbreeding should be able to generate animals with 
comparatively high levels of heterozygosity. 
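
The point that fewer alleles need not mean lower heterozygosity follows from the usual expected-heterozygosity formula, H = 1 - Σ p_i², which depends on how evenly the alleles are distributed and not only on how many there are. The sketch below uses hypothetical frequencies chosen to make the contrast visible; it is an illustration of the formula, not a reanalysis of Table 4.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i**2) for the allele frequencies at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical loci: two evenly distributed alleles versus five skewed ones.
few_even = [0.5, 0.5]
many_skewed = [0.80, 0.05, 0.05, 0.05, 0.05]

print(round(expected_heterozygosity(few_even), 3))
print(round(expected_heterozygosity(many_skewed), 3))
```

In this example the locus with two evenly distributed alleles has the higher expected heterozygosity, even though the other locus carries more than twice as many alleles.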


Table 7. Standard genetic distances (± SD) between ten breeds and Przewalski’s horse based 
on 38 polymorphic loci as described in Materials and methods 

      MH          NF          PF          TB          TK          AR          LI          PN          IB          MI 
NF    0.109±0.02 
PF    0.057±0.01  0.124±0.02 
TB    0.114±0.03  0.214±0.05  0.129±0.03 
TK    0.069±0.01  0.152±0.04  0.094±0.02  0.041±0.01 
AR    0.078±0.02  0.175±0.04  0.099±0.02  0.105±0.02  0.065±0.02 
LI    0.113±0.02  0.194±0.05  0.126±0.02  0.202±0.05  0.155±0.03  0.139±0.03 
PN    0.079±0.02  0.115±0.03  0.092±0.02  0.194±0.04  0.150±0.03  0.156±0.03  0.132±0.03 
IB    0.107±0.03  0.168±0.03  0.092±0.02  0.170±0.04  0.137±0.03  0.109±0.03  0.199±0.04  0.139±0.03 
MI    0.091±0.02  0.092±0.02  0.083±0.02  0.182±0.04  0.141±0.03  0.151±0.03  0.136±0.03  0.089±0.02  0.137±0.03 
PH    0.345±0.08  0.354±0.08  0.323±0.08  0.382±0.09  0.382±0.09  0.394±0.09  0.394±0.09  0.344±0.08  0.389±0.09  0.308±0.07 

Genetic profile for 231 Orlitza III 

In the earliest samples we tested in the mid-1980’s from AN 
representatives tracing to Orlitza III, we noted the conspicuous 
presence of TF-D and CA-E, which we had not previously 
detected in PHs (Bowling and Ryder, 1987). Recently, studies 
showed that the mtDNA type from Bars (son of Orlitza III) 
matched the haplotype of two of the earlier wild caught horses, 
so there was also evidence for genetic similarity of Orlitza III 
with other PHs. Although Orlitza III died in 1973, we were able 
to derive a comprehensive genetic profile at 29 microsatellite 
loci based on the markers that she transmitted to her offspring 
Bars, Pegas and Volga (to Vizor). From her derived type considering 
51 loci (see Tables 1-3), it is apparent that although 
she brought in new variants for the species at four loci, Orlitza 
III shared most of her alleles with those of the earlier PHs. The 
genetic legacy of Orlitza III survives in current pedigrees 
through three offspring, Bars, Pegas and Volga, each of which 
contained a distinctive sampling of her genome. 

Pedigree problems and reassignments 

Unanticipated pedigree problems were encountered in the 
process of deriving the genetic profile for Orlitza III. In our 
final analysis we were able to sort out pedigrees for most but not 
all of the animals. We found genetic variants other than those 
provided by the AN foundation stock. Our data suggested that 
the Orlitza III daughter Volga may have been represented only 
by her son Vizor. mtDNA typing substantiated the hypothesis 
of introgression into PHs through horses alleged to be offspring 
of Volga. Additional possibilities of introgression were also 
identified. 

As increasingly revealed by molecular studies in other taxa, 
maintaining without introgression a wild species that is interfertile 
with other species or subspecies is a difficult proposition. 
Other examples of introgression in endangered species include 
Asiatic lions (O’Brien et al., 1987; Driscoll et al., 2002), Bornean 
and Sumatran orangutans (Ryder and Chemnick, 1993), 
and American bison (Ward et al., 2001). Species preservation 
programs are built on a complex platform of priorities. The 
specter of introgression is but one of the issues facing conservation 
projects. In the case of PHs, the animals without either the 
previously documented Mongolian domestic introgression 
(Volf and Kus, 1991) or the introgression reported here represent 
a restricted breeding group with documented fertility 
problems (Bader et al., 1990; Hegel et al., 1990). 

Phylogenetic relationships 

The extra chromosome pair in PH compared with DH 
makes it conceptually difficult to put them on a direct line of 
relationship. Perhaps not surprisingly, the extended genome 
coverage of this study does not contradict previous dendrograms 
based on nuclear genes - PH remains as an outgroup to 
the domestic horse breeds (Bowling and Ryder, 1987; Dubrovskaya 
et al., 1992; Tikhonov et al., 1998). This conclusion persists, 
despite the accepted presence of introgression from DH 
represented in published pedigrees by the Mongolian domestic 
mare and the additional introgression proposed here. Furthermore, 
comparison of PH and DH mtDNA control region 
sequences with those of Pleistocene horses revealed a closer 
similarity of PH and DH mitochondrial DNA than either has 
to Pleistocene horses (Vila et al., 2001). 

Conclusion 

Orlitza III is the only PH species founder from which a 
derived genetic profile has been obtained. The genetic profile 
for Orlitza compared with that of the descendants of the earlier 
11 wild caught animals provides evidence that she contributed 
new genetic variants to PHs, including two alleles apparently 
unique to PH. She had overlap with domestic horses at substantially 
all genetic loci, as have previously tested animals. In the 
efforts to derive her type using her descendants bred at AN, 
incorrect pedigrees were found. With extended genetic profiling 
it was possible to ascertain the correct parentage for most of 
the horses. However, overall more alleles were present than 
could be accounted for from the AN founders, findings that are 
most straightforwardly explained by introgression from DH. 
The pool of PHs without introgression is small and there are 
serious concerns about whether that subset can maintain the 
species. Despite introgression, our data indicate that the Przewalski’s 
horse stands as an outgroup to breeds of domestic 
horses in the dendrograms based on nuclear genes (Fig. 2). The 
preservation of the Przewalski’s horse gene pool necessitates 
incorporation of all the animals with all founder ancestries, 
including those involving introgression. 
Abstract. Complete sets of chromosome-specific painting 
probes, derived from flow-sorted chromosomes of human 
(HSA), Equus caballus (ECA) and Equus burchelli (EBU) were 
used to delineate conserved chromosomal segments between 
human and Equus burchelli, and among four equid species, E. 
przewalskii (EPR), E. caballus, E. burchelli and E. zebra hartmannae 
(EZH) by cross-species chromosome painting. Genome-wide 
comparative maps between these species have been 
established. Twenty-two human autosomal probes revealed 48 
conserved segments in E. burchelli. The adjacent segment combinations 
HSA3/21, 7/16p, 16q/19q, 14/15, 12/22 and 4/8, presumed 
ancestral syntenies for all eutherian mammals, were also 
found conserved in E. burchelli. The comparative maps of 
equids allow for the unequivocal characterization of chromosomal 
rearrangements that differentiate the karyotypes of these 
equid species. The karyotypes of E. przewalskii and E. caballus 
differ by one Robertsonian translocation (ECA5 = EPR23 + 
EPR24); numerous Robertsonian translocations and tandem 
fusions and several inversions account for the karyotypic differences 
between the horses and zebras. Our results shed new light 
on the karyotypic evolution of Equidae. 

Copyright©2003 S. Karger AG, Basel 


The family Equidae (horses, zebras and asses) comprises 
seven extant species (Nowak, 1999) that shared a common 
ancestor ~1.9-2.3 million years ago, with the extant species 
emerging at approximately 0.89-1.07 million years ago according 
to the latest estimate (Oakenfull et al., 2000). The equids are 
remarkable both for their rapid karyotypic diversification and 
for their variation in diploid numbers, which range from 2n = 32 
in Hartmann’s mountain zebra (Equus zebra hartmannae; Benirschke 
and Malouf, 1967) to 2n = 66 in Przewalski’s horse 
(E. przewalskii; Benirschke et al., 1965). Although early conventional 
chromosome banding comparisons made it possible 
to identify several likely homologues among extant species, the 
complexity of the genomic rearrangements confounded attempts 
to provide a genome-wide view of the modes and tempo 
of chromosomal change in the various equid lineages (Ryder et 
al., 1978). 

Cross-species chromosome painting (Scherthan et al., 1994) 
in combination with chromosome sorting, comparative chromosome 
banding and digital imaging microscopy, offers an 
extremely powerful approach for delimiting true regions of 
chromosomal homology in mammals, which are essential to 
attempts to develop genome-wide homology maps among 
mammalian species (Yang et al., 1995). It is particularly pertinent 
to comparisons between distantly related species, species 
with highly rearranged karyotypes, as well as taxa for which 
mapping and other genomic data are rare or absent (for review 
see Chowdhary and Raudsepp, 2001). We have reexamined the 
karyotypic relationships among the domestic horse (E. caballus), 
Przewalski’s horse (E. przewalskii), Burchell’s zebra 
(E. burchelli) and Hartmann’s mountain zebra (E. z. hartmannae) 
by cross-species chromosome painting and present here 
the first genome-wide comparative chromosome maps of these 
four equid species. The taxonomy of the equids is a subject of 
debate. For ease of presentation we follow Ryder et al. (1978) in 
recognizing E. przewalskii as a distinct species [although some 
authors include E. przewalskii in E. caballus; Wilson and Reeder 
(1993), Nowak (1999). Most recently Groves and Ryder 
(2000) designate the domestic and Przewalski’s horses as E. ferus]. 
In addition, we provide a comparative map between 
human and Burchell’s zebra. 


Materials and methods 

Metaphase preparations 

Fibroblast cell lines of four equid species were used in this study. Cell 
lines of E. caballus and E. burchelli were provided respectively by the Kunming 
Cell Bank of the Chinese Academy of Sciences and University of Cape 
Town. Cell lines for E. przewalskii and E. z. hartmannae were obtained from 
the Zoological Society of San Diego Center for Reproduction of Endangered 
Species. Metaphase chromosomes were prepared from fibroblast cultures 
grown at 37 °C in Dulbecco’s modification of minimal essential medium 
(GIBCO) enriched with 10% fetal bovine serum (GIBCO), penicillin (100 
units/ml) and streptomycin (100 µg/ml). Chromosome preparations were 
made following standard procedures that included a 15-min hypotonic treatment 
in 0.4% KCl, fixation in 3:1 methanol:glacial acetic acid, and air-drying. 

Flow sorting and generation of chromosome-specific painting probes for E. burchelli 

Chromosomes of E. burchelli were sorted on a dual laser cell sorter (FACStar 
Plus, Becton Dickinson) as previously described (Yang et al., 1995). 
Chromosome-specific painting probes were made by degenerate oligonucleotide-primed 
PCR (DOP-PCR) amplification of flow-sorted chromosomes 
(Telenius et al., 1992). DOP-PCR amplified chromosome-specific DNAs 
were labeled during the secondary PCR by either incorporating biotin-16-dUTP 
(Roche), fluorescein-12-dUTP (Roche) or Cy3-dUTP (Amersham). 
The generation and characterization of human and E. caballus chromosome 
painting probes have been previously described (Ferguson-Smith, 1997; 
Yang et al., in press). 

Nomenclature 

The E. caballus chromosomes were identified according to the international 
standard nomenclature for E. caballus (Bowling et al., 1997); E. przewalskii 
chromosomes were arranged and numbered in most part following 
Ryder et al. (1978) and the international standard nomenclature for E. caballus 
(ISCNH, 1997). The E. z. hartmannae karyotype follows that of Richard 
et al. (2001). The E. burchelli chromosomes were arranged according to 
decreasing length. 

Fluorescence in situ hybridization 

Comparative chromosome painting between E. burchelli and human followed 
Yang et al. (1997, 2003). For comparative painting among the equid 
species hybridization time was 16-24 h and the temperature of the post-hybridization 
washes was 45 °C. No equid competitor DNA was used in the 
hybridization protocol. In cases where identification of chromosomes by 
DAPI (4′,6-diamidino-2-phenylindole) banding was ambiguous, sequential 
trypsin G-banding (Seabright, 1972) and 2-7 color FISH experiments were 
performed. Briefly, metaphase slides were baked at 65 °C for 3 h and then 
treated with 0.005% trypsin for 8-12 min before staining with 2% Giemsa 
for 10 min. After image capture of G-banded metaphases using the CytoVision 
system, the immersion oil and Giemsa stain were removed by serially 
washing the slide for 5 min in 100% ethanol followed by 100% methanol. 
The slides were then baked at 65 °C for at least 1 h. The G-banded slides were 
subsequently denatured in a 70% formamide/30% 2× SSC (v/v) solution at 
60 °C for 20-30 s. The hybridization, post-hybridization washes and detection 
conditions follow the procedure outlined above. In the case of multicolor 
FISH, probes were labeled with biotin-, FITC- and Cy3-dUTP according to 
the combinatorial labeling procedure proposed by Ried et al. (1992) and 
visualized with avidin-Cy5 and rabbit anti-FITC and FITC-conjugated goat 
anti-rabbit antibodies. 

Results 

E. caballus - E. przewalskii comparison 

To establish genome-wide homologies between E. caballus 
(2n = 64) and E. przewalskii (2n = 66) we hybridized the full 
complement of ECA painting probes (ECA1-31, X) onto E. 
przewalskii metaphases. Examples of comparative painting are 
shown in Fig. 1a and b and the summary of genome-wide correspondence 
between these two species is presented in Fig. 2. Our 
results confirm earlier investigations that showed that one 
Robertsonian translocation differentiates the karyotypes of 
these species (Benirschke et al., 1965; Short et al., 1974; Ryder 
et al., 1978). Our results provide for the unequivocal identification 
of the chromosomes that have been involved in the karyotypic 
divergence of these two horse species (i.e. ECA5 and 
EPR23 and EPR24). ECA5 can be reconstructed from the acrocentrics 
EPR23 and 24 via one centric fusion which accounts 
for the observed difference in 2n between them. 
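
The arithmetic behind the difference in diploid number can be written out explicitly. The sketch below assumes, as the painting results indicate, that every ECA autosome corresponds to a single EPR pair except ECA5, which corresponds to EPR23 plus EPR24, and that each karyotype has one pair of sex chromosomes; the names are illustrative.

```python
ECA_AUTOSOMES = range(1, 32)             # ECA1-31, as in the text
SPLIT_IN_EPR = {5: ("EPR23", "EPR24")}   # ECA5 = EPR23 + EPR24 (from the text)


def diploid_number(autosome_pairs: int) -> int:
    """2n for a karyotype with the given number of autosome pairs plus one sex pair."""
    return 2 * (autosome_pairs + 1)


eca_pairs = len(ECA_AUTOSOMES)
# Each ECA autosome counts once in EPR unless it is split into two acrocentrics.
epr_pairs = sum(len(SPLIT_IN_EPR.get(chrom, (chrom,))) for chrom in ECA_AUTOSOMES)

print("E. caballus 2n =", diploid_number(eca_pairs))
print("E. przewalskii 2n =", diploid_number(epr_pairs))
```

A single centric fusion merges two acrocentric pairs into one metacentric pair, so it changes 2n by two while leaving the number of chromosome arms unchanged.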

Reciprocal chromosomal painting between E. caballus and E. burchelli 

We were able to make chromosome-specific painting probes 
for 15 of the 22 E. burchelli chromosome pairs (EBU1-7, 9, 11, 
15-19, and 21). EBU8 and X were found in the same flow-peak, 
as were EBU10 and EBU12, EBU13 and 14, and EBU20 
and one homologue of EBU19 (Fig. 3). Paints derived from 
EBU1, 8 + X, 10 + 12, and 17-21 show strong cross-hybridization 
to the heterochromatic regions of these chromosomes. In 
particular, this was most marked at 1pter, 12pter, 17-21qter 
and the interstitial heterochromatic region of Xq (data not 
shown) and is likely to be due to the existence of homologous 
repetitive sequences in these regions. 


Fig. 1. Examples of cross-species chromosome painting. (a, b) Simultaneous 
painting of a G-banded metaphase of Equus przewalskii (EPR) with 
probes for eight E. caballus (ECA) chromosomes by multicolor FISH. The 
color for each probe is shown to the left. Note that probe for E. burchelli 
(EBU) chromosome 17 (= ECA5q) was added to differentiate the ECA5q 
from ECA5p. The painting result demonstrates that ECA5q = EPR23 (arrows) 
and ECA5p = EPR24 (arrowheads). (c) Hybridization of EBU17 probe 
to the proximal region of human (HSA) 1p. (d) Hybridization of HSA9 probe 
onto the proximal part of EBU1q and to EBU6p. (e, f) Simultaneous hybridization 
of ECA11-13, 15, 17-19 and 22 probes onto metaphases of E. burchelli 
(e) and E. z. hartmannae (f) by multicolor FISH, with the color of each 
probe given to the left. In both instances only the identity of one of the two 
homologues is shown. Note that EZH13 is painted by probes from ECA13 
and 18 (f), and that EBU4 is painted by probes from ECA18 and 19. (g, h) 
Simultaneous hybridization of EBU1, 3, 7, 9, 11, 15 and 17 probes onto 
metaphases of E. z. hartmannae (g) and E. caballus (h) by multicolor FISH, 
with the color of each probe indicated to the left. On the metaphase spread 
the identity of one of the two painted homologues plus the X and Y are 
shown. EBU17 probe (in green) shows strong cross-hybridization to the telomeric 
regions of several EZH chromosomes. It also hybridizes to heterochromatic 
regions on Xq and Y in both E. burchelli and E. z. hartmannae. 
Fig. 2. Summary of genome-wide chromosomal 
correspondence between E. caballus and E. 
przewalskii with G-banded karyotype of E. przewalskii 
as the reference. The identities of E. przewalskii 
chromosomes are shown below each homologous 
pair; the numbers of corresponding E. 
caballus chromosomes are shown to the right. The 
insert demonstrates that ECA5 can be reconstructed 
from EPR23 and EPR24 via a centric 
fusion. 


Fig. 3. Bivariate flow karyotype of E. burchelli. 
Note that EBU8 and X were found in the same 
flow-peak, as were EBU10 and EBU12, EBU13 and 
EBU14, and EBU20 and one homologue of EBU19. 
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Reciprocal painting was used to define unambiguously the 
genome-wide homologies that exist between E. caballus and 
E. burchelli. Our approach was, first, to hybridize the complete 
set of E. caballus paints (ECA1-31, X) to metaphase chromo¬ 
somes of E. burchelli. Six autosomal probes (ECA2, 3, 5, 6, 8, 
and 10) each produced signals on two pairs of E. burchelli chro¬ 
mosomes; the remaining 25 autosomal painting probes and the 
X each hybridized to a single pair of EBU chromosomes. In 
total, the 31 ECA autosomal probes delimited 37 homologous
segments in the E. burchelli genome (Fig. 4). Secondly, we 
hybridized paints derived from all the E. burchelli flow-peaks 
(including those that contain two types of chromosomes) to E. 
caballus metaphases to resolve the sub-chromosomal homologies
of E. caballus that correspond to multiple E. burchelli chromosomes
(or chromosomal segments) and vice versa. Examples
of the reciprocal painting are shown in Fig. 1e and h and the
chromosomal correspondence between these two species is 
summarized in Fig. 4. Although four probes each represent two 
types of EBU chromosomes (i.e. EBU8 and X, EBU 10 and 12, 
EBU 13 and 14, EBU 19 and 20), the reciprocal painting results 
allow for the establishment of one-to-one correspondence be¬ 
tween conserved chromosomal segments in the genomes of 
E. caballus and E. burchelli (Fig. 4). In brief, seven E. burchelli 
chromosomes (EBU2, 14, 16, 18-21) are each homologous to 
one entire ECA chromosome. EBU 17 is homologous to 
ECA5q; EBU1, 11 and 12 are each homologous to three ECA 
chromosomes and/or chromosomal arms. The remaining 10
EBU autosomes (3-10, 13 and 15) each correspond to two ECA 
chromosomes and/or chromosome arms. Most of the interspe¬ 
cies homologues show a high degree of conservation in G-band- 
ing patterns. A notable exception to this is EBU 14 and its horse 
homologue ECA7 (= EPR6; Fig. 2) which differ in banding pat¬ 
tern, probably as a result of a pericentric inversion. 
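
The segment counts quoted above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only: the variable names are ours and are not part of the study, and the numbers are taken directly from the preceding paragraph. It tallies the conserved segments from both directions of the reciprocal painting.

```python
# Cross-check of the ECA/EBU segment counts quoted in the text.
# Variable names are illustrative; the numbers come from the paragraph above.

# Direction 1: horse (ECA) autosomal paints on E. burchelli metaphases.
eca_probes_two_signals = 6    # ECA2, 3, 5, 6, 8 and 10
eca_probes_one_signal = 25    # remaining autosomal paints
segments_from_eca = 2 * eca_probes_two_signals + 1 * eca_probes_one_signal

# Direction 2: E. burchelli (EBU) autosomes expressed as ECA chromosomes/arms.
# Key = number of ECA chromosomes or arms per EBU autosome,
# value = number of EBU autosomes in that class.
ebu_classes = {
    1: 7 + 1,   # EBU2, 14, 16, 18-21 (one whole ECA each) plus EBU17 (= ECA5q)
    2: 10,      # EBU3-10, 13 and 15
    3: 3,       # EBU1, 11 and 12
}
segments_from_ebu = sum(n_seg * n_chrom for n_seg, n_chrom in ebu_classes.items())
ebu_autosomes = sum(ebu_classes.values())

print(segments_from_eca, segments_from_ebu, ebu_autosomes)
```

Both directions give 37 segments, and the four classes account for all 21 E. burchelli autosomes, so the reported figures are mutually consistent.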

Reciprocal painting between human and E. burchelli 

Cross-species reciprocal painting was used to map the evolutionarily
conserved segments between E. burchelli and human
genomes. Examples of chromosome painting are shown in
Fig. 1c and d and hybridization patterns of all probes are summarized
against a G-banded karyotype of E. burchelli (Fig. 4) as
well as on the human idiogram (Fig. 5).

The twenty-two human autosomal paints defined 49 con¬ 
served segments in the zebra genome. Paints derived from the 
21 EBU autosomes detected 60 conserved segments in the 
human genome. EBU16, 17 and 18 are each homologous to one
human chromosomal segment. The remaining 18 EBU autos¬ 
omes correspond to 2-5 homologous segments in the human 
genome. Six human chromosomes (HSA13, 15, 17, 18, 20 and 
21) each correspond to one chromosomal segment in E. burchelli,
indicative of complete synteny conservation.


Painting E. z. hartmannae chromosomes with probes from E. caballus and E. burchelli

To establish the genome-wide correspondence among E. z.
hartmannae, E. burchelli and E. caballus, the complete complement
of E. caballus and E. burchelli painting probes was
hybridized onto the metaphase chromosomes of E. z. hartmannae.
Examples of the FISH results are shown in Fig. 1f and g
and a summary of the hybridization patterns is presented in 
Fig. 6. The 31 E. caballus autosomal paints revealed 38 homol¬ 
ogous segments in E. z. hartmannae. Seven painting probes 
(ECA2, 3, 4, 5, 6, 8 and 10) each painted two chromosomes or 
chromosomal segments, with the remaining 24 probes corre¬ 
sponding to a single chromosome or chromosomal segment in 
E. z. hartmannae. The painting probes derived from the 21 
autosomal chromosomes of E. burchelli detected 30 conserved 
segments in the E. z. hartmannae genome. In addition, the 
paints derived from ECAX and Y, EBU1, 8 + X, 10 + 12, 13 + 
14, 17-21 and X show strong cross-hybridization to the hetero- 
chromatic regions on EZHXq and Y as well as the telomeric 
regions of several EZH autosomes including EZH1, 2, 4, 7, 8, 
10-12 presumably indicating that these chromosomes share 
similar repeats. 

The integration of hybridization results of human probes 
(Richard et al., 2001) and E. caballus and E. burchelli probes 
(this study) onto E. z. hartmannae, together with reciprocal 
Fig. 5. Summary of hybridization patterns of
E. burchelli (EBU) probes on the human idiogram.
Although EBU8 and X were sorted together,
as were EBU10 and EBU12, EBU13 and 14,
and EBU20 and one homologue of EBU19
(Fig. 3), we were able to determine their correspondence
to human chromosomes by integrating
the data from the EBU probes onto human chromosomes
and human probes on EBU chromosomes
(Fig. 4).






painting data generated for E. burchelli and E. caballus (this 
study), allows for the deduction of one-to-one correspondence 
between the homologous segments conserved in E. caballus, E. 
burchelli and E. z. hartmannae. This does not require further 
reverse painting of E. caballus and E. burchelli chromosomes 
with probes derived from the flow-sorted chromosomes of E. z. 
hartmannae (Fig. 6). Comparison of G-banding patterns in the
regions of sequence homology revealed by FISH demonstrates
that most interspecific homologues are characterized by conserved
banding patterns. Importantly, however, although previous
banding comparison suggested that ECA1 = EBU2 =
EZH1 (Ryder et al., 1978), our results show that EZH3 (and not
EZH1) is homologous to ECA1 and EBU2.

Discussion 

The completion of the human genome sequencing project 
has made the human genome the standard reference for comparative
genomic studies of mammals. Additionally, rapid progress
in the horse genome project (Chowdhary et al., 2003)


makes this species a useful adjunct for comparative genomic 
and cytogenetic studies of the equids. Among the equids, 
genome-wide comparative maps exist for E. caballus and E. z. 
hartmannae both of which have been established by compara¬ 
tive painting with human painting probes (Raudsepp et al., 
1996; Richard et al., 2001). Additionally, paints derived from 
the twelve E. caballus metacentric autosomes (ECA1-12) and 
the sex chromosomes have been used to investigate the karyo¬ 
typic relationships between the horse and donkey (Raudsepp 
and Chowdhary, 1999). Our study provides the first genome-wide
comparative maps between human and E. burchelli, and
provides comparative genome maps among E. przewalskii,
E. caballus, E. burchelli, and E. z. hartmannae. Such genome-scale
chromosomal correspondences could not be established
using conventional cytogenetic approaches, the only exception
being the karyotypic difference between E. przewalskii and
E. caballus, which involves a single Robertsonian translocation.
The integration of our comparative chromosome maps with 
those of Raudsepp et al. (1996), Raudsepp and Chowdhary 
(1999) and Richard et al. (2001) sheds new light on the genome 
organization and karyotype evolution of Equidae. 
Conservation of ancestral eutherian syntenies in the Equidae

Comparative chromosome painting between human and 
representative species of twelve of the extant 18 eutherian 
orders has led to proposals for the composition of ancestral 
karyotypes for eutherian mammals (Murphy et al., 2001; Yang
et al., 2003). Our results demonstrate that in the case of E. burchelli
ancestral syntenies equivalent to HSA1, 2pq, 2q, 5, 6, 7a
(= 7p11-21 + 7q21 + 7q31-36), 8q, 10q, 9, 11, and 19p have
each broken into multiple segments, but the ancestral syntenies
HSA10p, 13, 17, 18, 20 and X have all been conserved. Of the
proposed ancestral syntenic associations, 7b (= 7p22 + 7q11 +
7q22)/16p, 12q-distal/22q-proximal and 16q/19q have been
conserved in their entirety while HSA3/21, 4/8p, 12pq-/22q-distal
and 14/15 have been partially conserved. Similar patterns
showing the conservation and disruption of ancestral syntenies
have been found in the genome of E. z. hartmannae (Richard
et al., 2001) and to a lesser extent in E. caballus (Raudsepp
et al., 1996). Notably, however, previous comparative
painting schemes between human and horse (Raudsepp et al.,
1996) failed to demonstrate the presence of the HSA3/21 and
4/8 syntenies in the E. caballus genome. Our comparative
painting results among equid species suggest the retention of
the HSA3/21 and HSA4/8 syntenies on ECA26 and ECA27,
respectively. In addition, Raudsepp et al. (1996) reported that
ECA1p-q = HSA22/10-cen-2/15/12/15/14 while our data indicate
that ECA1p-q = HSA10-cen-1/10/15/14. Interestingly, the
most recent radiation map shows that ECA1p-q = 22/10-cen-15/14,
suggesting that final determination will be dependent on
the development of a high-resolution comparative gene map.

Chromosomal mechanisms underlying the karyotype differences of E. caballus, E. burchelli and E. z. hartmannae

Our results demonstrate that most of the E. caballus chro¬ 
mosomes have been retained in toto, or as chromosome arms or 
parts of chromosome arms, in the zebras. Exceptions include 
the six E. caballus chromosomes (ECA2, 3, 5, 6, 8 and 10) for 
which the corresponding p and q arms were found on different 
chromosomes in the genomes of E. burchelli and E. z. hartman¬ 
nae. In addition, ECA4 has been conserved in E. burchelli but is 
broken into two segments in E. z. hartmannae. Most of the 
homologous segments shown to be shared among the equid spe¬ 
cies by cross-species painting display conserved banding pat¬ 
terns. Exceptions to this involve the homologues of ECA 1 and 
ECA7. The different morphologies of ECA1 and ECA7 and 
their corresponding homologues in the donkey and zebra spe¬ 
cies suggest the influence of intrachromosomal rearrangement 
such as inversions. Numerous centric fissions, centric fusions 
and tandem fusions, together with a small number of inversions 
underlie the karyotypic differences of the three equid species. 
For instance, the E. z. hartmannae karyotype can be recon¬ 
structed from the E. caballus karyotype through seven centric 
fissions, eleven centric fusions, twelve tandem fusions and at 
least two pericentric inversions. In turn, the karyotype of the 
E. burchelli can be reconstructed from that of E. caballus
through six centric fissions, eleven centric fusions, five tandem 
fusions and at least one inversion. It will require six centric fis¬ 
sions, seven centric fusions, nine tandem fusions and at least 
one inversion to reconstruct the E. z. hartmannae karyotype 
from that of the E. burchelli. 

Cytogenetic signatures 

Further analysis of the three human-equid maps, as well as
of those using the horse chromosomes as reference, reveals several
cytogenetic changes that may be signatures for certain phylogenetic
lineages. For example, the HSA1/10 and 11/19 contiguous
combinations are present in all equid species studied
thus far and are likely cytogenetic signatures for the Equidae. In 
addition, the syntenic association of the HSA5/19 found on 
EBU9p is also present in the pig, Indian muntjac, cattle and 
dolphin. Therefore, the HSA5/19 association appears to be a 
synapomorphy that supports the arrangement of the Cetartiodactyla
+ Perissodactyla in the Euungulata (true ungulates) and
in so doing gives additional credence to the suggestion of Waddell
et al. (2001) that this is a natural grouping. Similarly, the syntenic
association ECA2q/3q is present in the zebras and donkey
suggesting that this may be a synapomorphy uniting these 
lineages. This finding is consistent with their grouping as sister 
clades in a maximum likelihood tree based on mtDNA control
region and 12S rRNA sequences (Oakenfull et al., 2000). The
ECA6q/25/16, 2p/15, 4p/31, 3p/10p, 6p/12, and 8p/20 associations
appear to be unique (an autapomorphy) to the zebras.

Ancestral karyotype and phylogeny of the Equidae 

An ultimate aim of many comparative cytogenetic and 
genomic studies is to reconstruct the ancestral karyotype and 


develop a karyotypic phylogeny for species of interest (Neusser 
et al., 2001; Nie et al., 2002). To achieve this, full taxon repre¬ 
sentation is useful and appropriate outgroup comparisons criti¬ 
cal, since it is only by documenting primitive and derived char¬ 
acter states within the karyotypes along cladistic principles that 
determining the magnitude of change that has occurred within 
each lineage becomes possible. In the case of the Equidae, however,
comparable data are lacking for E. grevyi, E. kiang, and E.
onager [species recognition follows Nowak (1999)]. The most
closely related outgroup species are the tapirs, followed by the 
rhinoceroses (Tougard et al., 2001), all of which have relatively 
high, or high diploid numbers (tapirs 2n = 52-80, Houck et al., 
2000; rhinoceroses 2n = 82-84, Houck et al., 1994; Trifonov et 
al., in press). The data from comparative chromosome painting 
among five equid species (i.e. the ECA-EAS comparison, Raud- 
sepp and Chowdhary, 1999; Yang et al., submitted; the ECA- 
EPR-EBU-EZH comparison, this study) demonstrate that 
these five equid species share 37 evolutionarily conserved segments
equivalent to ECA1, 2p, 2q, 3p, 3q, 4p, 4q, 5p, 5q, 6p,
6q, 7, 8p, 8q, 9, 10p, 10q and 11-31. We believe that these 37
conserved segments originated during the divergence of the 
common ancestor of the modern equid species and that the 
ancestral karyotype of the Equidae is likely to have had a high 
diploid number. The nearly random distribution of the 37 conserved
segments in the E. burchelli and E. z. hartmannae
genomes suggests that independent fusion (centric fusions and
tandem fusions) combinations of the 37 ancestral segments
gave rise to the karyotypes of the extant equid species. Hybridization
of the E. caballus paints onto the uncharted genomes of
E. grevyi, E. kiang and E. onager, as well as to the rhinoceros
and tapirs, should finally allow for the development of a well-resolved
chromosomal phylogeny for the extant equids.
Abstract. There is incredible morphological and behavioral
diversity among the hundreds of breeds of the domestic dog, 
Canis familiaris. Many of these breeds have come into exis¬ 
tence within the last few hundred years. While there are 
obvious phenotypic differences among breeds, there is marked 
interbreed genetic homogeneity. Thus, study of canine genetics 
and genomics is of importance to comparative genomics, evo¬ 
lutionary biology and study of human hereditary diseases. The 
most recent version of the map of the canine genome comprises
3,270 markers mapped to 3,021 unique positions with
an average intermarker distance of ~1 Mb. The markers
include approximately 1,600 microsatellite markers, about 


1,000 gene-based markers, and almost 700 bacterial artificial 
chromosome-end markers. Importantly, integration of radia¬ 
tion hybrid and linkage maps has greatly enhanced the utility of 
the map. Additionally, mapping the genome has led directly to 
characterization of microsatellite markers ideal for whole ge¬ 
nome linkage scans. Thus, workers are now able to exploit the 
canine genome for a wide variety of genetic studies. Finally, the 
decision to sequence the canine genome highlights the dog’s 
evolutionary and physiologic position between the mouse and 
human and its importance as a model for study of mammalian 
genetics and human hereditary diseases. 

Copyright©2003 S. Karger AG, Basel 


The domestic dog has occupied a unique position in human
lives for many centuries, serving many roles, including warrior,
shepherd, guide, retriever, hunter, and companion. Clearly,
the dog has many notable behavioral and physical characteristics
that have been valued over thousands of years. However,
the greatest benefit of the dog may yet be realized: its contribution
to genetics.

Recent data suggest the domestic dog’s origin was in East 
Asia (Savolainen et al., 2002). Specifically, the common origin 
of New and Old World dogs from gray wolves is demonstrated 
by mtDNA evidence dating the event to approximately 15,000 
to 40,000 years ago (Leonard et al., 2002). Since its domestica¬ 
tion, specific breeds have been developed with distinctly differ¬ 
ent appearances and behavioral traits. The rapid origin of 
breeds has been based on inbreeding a genetic pool composed
of a limited founder number (Ostrander and Kruglyak, 2000). 
However, the price of limited genetic heterogeneity has been 
the emergence of more than 450 naturally occurring hereditary 
diseases. The small founding populations, high levels of in- 
breeding and creation/maintenance of multigenerational pedi¬ 
grees have resulted in a population ideal for genetic studies 
because often only a single mutated allele is responsible for 
these diseases (Ostrander and Kruglyak, 2000). Furthermore, 
almost half of the hereditary diseases of the dog have an analo¬ 
gous disease in the human (OMIA, 2003). Thus, the dog serves 
as a natural model for study of these diseases and obviates the 
need to construct knockout models for study of such diseases. 

Current status of the canine genome 

Mammalian genome comparisons 

Significantly less is known about the canine genome
than about the murine and human genomes. Determination
of the sequences of the human and murine genomes confirmed
syntenic conservation of genetic loci by pairwise alignment
of nearly 13,000 orthologous gene pairs. However, the
median amino acid sequence identity of mouse to human is 
only 78.5 % (Boguski, 2002). Thus, human and murine compar¬ 
isons may not be sufficient to complete our understanding of 
mammalian gene function and evolution. The Mouse Genome 
Consortium recently noted that many additional mammalian 
genome sequences would be necessary to fully decipher infor¬ 
mation from the human genome sequence. This acknowledge¬ 
ment results from the recognition that gene family changes 
between the murine and human genomes may represent physi¬ 
ological divergence from the common ancestor of the mouse 
and human (Waterston, 2002). Therefore, because the dog is 
more closely related, evolutionarily and physiologically, to the 
human than is the mouse, it is an ideal model for comparative 
genetics and will complement information gleaned from se¬ 
quences of other genomes. 

Mapping the canine genome 

The dog has 38 pairs of autosomes, plus the sex chromosomes.
Unfortunately, acrocentricity and similarity in the sizes of 
autosomes complicated standardization of the canine karyo¬ 
type. An international collaboration, DogMap, was established 
in 1993 to create a low-resolution marker map, and to stan¬ 
dardize the karyotype. In 1997, sixteen linkage groups were 
derived from the analysis of 94 polymorphic loci (Lingaas et al., 
1997). Shortly thereafter, characterization of 150 microsatellite 
markers and 30 additional linkage groups provided the first 
meiotic linkage map (Mellersh et al., 1997), and a framework 
for development of the 10-cM map (Neff et al., 1999). A map of 
341 markers distributed at an average of 9 cM soon followed 
(Werner et al., 1999). 

Distinguishing the canine chromosomes with the use of 
whole chromosome-specific fluorescent in situ hybridization 
paint probes (Breen et al., 1999), the construction of a canine 
bacterial artificial chromosome (BAC) library (Li et al., 1999), 
and construction of a radiation hybrid (RH) panel accelerated 
development of a high-resolution map of the canine genome. 
An initial RH panel was constructed (Priat et al., 1998), and
updated by a whole genome RH panel (Vignaux et al., 1999). 
Mellersh and colleagues (2000) integrated the linkage and RH 
maps, a major advance in study of the canine genome. Maps 
were integrated through duplicate typing of 217 markers, gener¬ 
ating an integrated map of 724 unique markers suitable for 
both linkage analysis and comparative mapping. Importantly, 
integration allowed for identification of markers useful in 
whole genome linkage scans. This subset of markers, termed 
the Minimal Screening Set-1 (MSS-1) (Richman et al., 2001),
has been multiplexed to streamline linkage analyses (Cargill et 
al., 2002). Estimates of genome coverage suggest that 77% of 
the genome is within 1 Mb of at least one marker in the MSS-1 
(Richman et al., 2001). The most recent RH panel, 
RHDF5000-2, was used to map 3,270 markers to 3,021 unique 
positions with an average intermarker distance corresponding 
to ~ 1 Mb (Guyon et al., 2003). The 1 Mb RH map facilitated 
characterization of a more comprehensive screening set, the 
MSS-2, composed of 325 microsatellite markers with an aver¬ 
age spacing of 9 Mb (Guyon et al., 2003). 
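
The marker spacings quoted above follow from dividing the genome size by the number of map positions. The Python sketch below is a rough illustration, not the authors' calculation: it assumes the 2.8 × 10^9 bp genome size listed in Table 1 and an even spread of positions, and the function name is ours.

```python
GENOME_SIZE_BP = 2.8e9  # genome size listed in Table 1

def mean_spacing_mb(n_positions: int, genome_size_bp: float = GENOME_SIZE_BP) -> float:
    """Average distance between evenly spread map positions, in megabases."""
    return genome_size_bp / n_positions / 1e6

rh_spacing = mean_spacing_mb(3021)   # unique positions on the RHDF5000-2 map
mss2_spacing = mean_spacing_mb(325)  # microsatellite markers in the MSS-2

print(f"RH map: {rh_spacing:.2f} Mb; MSS-2: {mss2_spacing:.1f} Mb")
```

Under these assumptions the spacings come out at about 0.93 Mb for the RH map and about 8.6 Mb for the MSS-2, in line with the reported ~1 Mb and 9 Mb.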


Sequencing the canine genome 

The importance of the dog as a model for many human 
hereditary diseases is a major impetus for sequencing the 
canine genome. Celera (Rockville, MD) conducted the first 
large-scale canine sequencing effort with DNA from a male 
Standard Poodle. Access to the sequence (coverage is 1x) is
available through collaborations with The Institute for Genome 
Research. Work using this resource is currently underway and 
one aspect of this study is directed towards mapping indepen¬ 
dent gene sequences (White Paper Proposal for Sequencing the 
Canine Genome, 2002). Because this sequence has consider¬ 
able gaps, the need for a more complete sequence has been rec¬ 
ognized as high priority by the National Institutes of Health 
(http://www.nih.gov/news/pr/sep2002/nhgri-12.htm). Sequencing
(beginning June 2003) of the genome, using DNA isolated
from a Boxer (http://www.genome.gov/11007358), will be performed
at the Whitehead Genome Center in a manner similar
to that utilized in sequencing of the murine genome. By utilizing
different library sizes, cloning bias will be minimized, thereby
allowing a hierarchical approach for sequence assembly.
This method will provide approximately 6-fold sequence cover¬ 
age and 50-fold physical coverage. Other sequencing efforts 
include end-sequencing of clones retrieved from the BAC libra¬ 
ry. More specifically, sequences have been obtained from 
approximately 1,500 BAC clones and 668 of these are mapped, 
yielding an initial framework for future mapping (Guyon et al., 
2003). 
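
Why 1x coverage leaves considerable gaps while 6-fold coverage does not can be seen with a standard idealization. The Python sketch below is our illustration and is not taken from the cited proposals: it assumes the Lander-Waterman model, in which reads land uniformly at random, so the expected fraction of the genome covered at sequence coverage c is 1 - e^(-c). Real libraries deviate from this because of cloning bias and repeats.

```python
import math

def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read at the given
    fold coverage, assuming uniformly random read placement (idealized)."""
    return 1.0 - math.exp(-coverage)

for c in (1.0, 6.0):
    covered = expected_fraction_covered(c)
    print(f"{c:.0f}x coverage: about {100 * covered:.1f}% of bases covered")
```

Under this model roughly 37% of bases remain unsampled at 1x, whereas less than 0.3% remain unsampled at 6x.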

A second valuable resource for study of the canine genome 
is the development of a single nucleotide polymorphism (SNP) 
map. Recent results indicate that interbreed sequence compari¬ 
sons are a reasonable strategy for identifying useful SNPs with¬ 
in many breeds (Brouillette and Venta, 2002). Even so, it must 
be noted that an SNP map which will facilitate high throughput 
screening of the genome (10 kb spacing) will require the identi¬ 
fication of 250,000 SNPs (White Paper Proposal for Sequenc¬ 
ing the Canine Genome, 2002). Additionally, efforts are being 
made to construct normalized or subtracted cDNA libraries, 
and to characterize expressed sequence tags (ESTs). McCombie 
and others (Table 1) maintain an interactive website for access 
to current canine EST projects (http://www.cshl.org/genbin/cgi- 
bin/golden_retriever.cgi), listing several canine cDNA libraries. 
Lastly, the canine genome sequencing efforts will be comple¬ 
mented by expression profiling in canine models of human dis¬ 
eases and such work is underway through various collabora¬ 
tions (http://crisp.cit.nih.gov/crisp). 

Canine diseases and genetic analyses 

As progress is made in constructing maps and sequencing of 
the canine genome, more mutation based and linkage based 
genetic tests will be developed to allow detection of deleterious 
alleles. While mutation based tests allow guaranteed results 
because the specific mutation has been defined, linkage based 
tests do not provide such certainty due to recombination events 
that may cause false negative and positive results. 
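
The uncertainty of linkage based tests can be made concrete with a small calculation. The Python sketch below is our illustration, not part of the cited work: it assumes Haldane's map function (no crossover interference) to convert a marker-to-mutation distance in centimorgans into a recombination fraction, which approximates the chance per meiosis that the marker allele and the disease allele become separated. The distances used are hypothetical.

```python
import math

def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction for a map distance in centimorgans,
    using Haldane's map function (assumes no interference)."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

# Hypothetical marker-to-mutation distances, for illustration only.
for distance in (1, 5, 10):
    theta = haldane_recombination_fraction(distance)
    print(f"{distance:>2} cM: ~{100 * theta:.1f}% chance of recombination per meiosis")
```

A marker 1 cM from the mutation is separated from it in about 1% of meioses, while a marker 10 cM away is separated in about 9%, which is why a closely linked marker gives a more reliable test.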
Table 1. Overview of current canine genomic information

Chromosome number (2n)                              78
Genome size                                         2.8 × 10^9 bp
Meiotic linkage map
    microsatellites                                 354
Radiation hybrid map^a
    microsatellites                                 1,596
    gene sequences + ESTs                           900
    BAC end sequences                               668
    sequence tag sites (STS)                        106
Integrated linkage and RH map^b
    common to cytogenetic and RH                    102
    common to RH and meiotic linkage                251
    common to cytogenetic and meiotic linkage       52
BAC library                                         8.1-fold coverage
BAC end sequencing^a                                1,504
BAC end sequence containing microsatellites^a       39
ESTs^c                                              19,465
SNPs^b                                              78
STS^b                                               200

a Numbers reflect those reported in 2003 (Guyon et al.), and discovery of SNPs,
STSs, and BAC end sequences continues in numerous laboratories.
b As reported by FHCRC (www.fhcrc.org/science/dog_genome/dog.html).
c EST development is ongoing and referenced number is current as of submission
date, with R. McCombie and A. George having submitted the majority of EST
sequences to GenBank.


Candidate gene analyses 

The candidate gene approach uses analysis of pedigrees and 
phenotypes together with comparison of analogous diseases in 
other species to identify candidate genes that may be responsi¬ 
ble for a given disease. An excellent example of this is the iden¬ 
tification of the gene causative for autosomal dominant pro¬ 
gressive retinal atrophy in the English Mastiff. After determin¬ 
ing the mode of inheritance by controlled outcross matings, 
Kijas and coworkers (2003) selected three genes responsible for 
approximately half of all human autosomal dominant retinal 
disease (www.sph.uth.tmc.edu/Retnet) for evaluation in the 
dog. Their analysis revealed that the rhodopsin gene contained 
a C → G transversion, resulting in an altered amino acid within
the protein. Association testing followed, and confirmed that 
the rhodopsin mutation was responsible for this form of pro¬ 
gressive retinal atrophy in the English Mastiff. 

Another recent candidate gene investigation revealed the 
mutation responsible for X-linked Alport syndrome (XLAS) in 
a group of mixed-breed dogs. Alport syndrome is a chronic, 
progressive glomerulonephritis characterized by ultrastructural 
changes in the glomerular basement membrane. The most com¬ 
mon form of AS in the human is XLAS and is due to mutations 
in COL4A5 (Jais et al., 2000). A colony of mixed-breed dogs 
segregating naturally occurring XLAS was studied and found to 
exhibit clinical, immunohistological, and ultrastructural char¬ 
acteristics virtually identical to that seen in human XLAS (Lees 
et al., 1999). Therefore, full-length COL4A5 cDNAs from nor¬ 
mal and affected dogs were sequenced. This work revealed a 
10-bp deletion in exon 9, resulting in a frameshift which dis¬ 
rupts the collagenous region of the protein, causing a premature 
stop codon within exon 10. This study also led to development 
of a mutation based genetic test for XLAS (Cox et al., 2003). 


Other candidate gene analyses include the study of dilated 
cardiomyopathy (DCM) in the Doberman Pinscher. DCM pro¬ 
vides an example of the difficulties geneticists sometimes face 
in studying canine hereditary diseases. That is, DCM is a lethal,
late-onset disease (average age of onset is about six years) and this
complicates obtaining DNA samples for linkage analysis. The
possibility of phenocopies adds to the difficulties in analysis.
Furthermore, the mode of inheritance is uncertain although the
likely presence of a founder effect suggests a major gene is nec¬ 
essary for a dog to be affected. Several DCM-causing genes 
have been identified in human families and are candidate genes 
for canine DCM. In the human, every gene identified to date 
accounts for a small proportion of heritable DCM, suggesting 
that any number of genes could be responsible for canine DCM 
(Schonberger and Seidman, 2001). Progress is being made by 
eliminating many of these candidate genes as causative (e.g., 
Meurs et al., 2001; Venta et al., unpublished results), but new 
approaches to identify genes for complex diseases are needed. 

Linkage analyses 

Linkage studies rely upon informative pedigrees that have 
affected and non-affected individuals available for sampling. 
Generally, such extended pedigrees are not available or accessi¬ 
ble for human studies, but are common in the dog. Another 
advantage of using the dog in genetic studies is that the problem 
of locus heterogeneity inherent to human linkage studies is 
reduced. This is due to founder effects and minimal genetic 
diversity in breeds that have arisen from small populations. 

Some conditions (e.g., deafness) are not amenable to the 
candidate gene approach since too many candidates exist. 
Thus, pedigree assembly for linkage analysis is necessary. 
While it is known that a relationship exists between hereditary 
hearing loss and pigmentation in the Dalmatian, the nature of 
this association is unclear. Even so, candidate gene studies have 
eliminated a few genes such as PAX3 (Brenig et al., 2003), but 
the number of remaining candidate genes is quite large. There¬ 
fore, linkage analysis is being performed by Muhle et al. (2002) 
who have assembled a kindred of Swiss Dalmatians and Cargill 
et al. (personal communication), who have assembled a 
kindred of US Dalmatians. 

The utility of linkage analysis is exemplified by the study of 
hereditary multifocal renal cystadenocarcinoma and nodular 
dermatofibrosis (RCND), a rare cancer in the German Shep¬ 
herd Dog. RCND is a spontaneous, inherited disease with sev¬ 
eral human correlates, albeit none that match all facets of the 
disease exactly. Jonasdottir and colleagues (2000) propagated a
colony segregating RCND and used the MSS-1 screening panel 
for linkage analysis. One microsatellite marker yielded a lod 
score of 16.7 and the RCND locus was mapped to canine chro¬ 
mosome 5. Within the targeted region are several attractive 
candidate genes and future work will likely identify the gene 
responsible for RCND. 
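
For readers unfamiliar with lod scores, the following Python sketch shows how one is computed in the simplest two-point case. It is a textbook idealization rather than the analysis performed in the cited study: it assumes phase-known, fully informative meioses, and the meiosis counts used are hypothetical.

```python
import math

def lod_score(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """Two-point lod score: log10 of the likelihood of the data at
    recombination fraction theta relative to free recombination (0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    k, n = n_recombinants, n_meioses
    if theta == 0.0:
        if k > 0:
            return float("-inf")  # any recombinant excludes theta = 0
        log_l_theta = 0.0
    else:
        log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null

# Hypothetical example: 56 informative meioses with no recombinants.
print(round(lod_score(56, 0, 0.0), 2))
```

In this simplified setting each non-recombinant meiosis adds log10(2), or about 0.30, to the lod score at theta = 0, so a score above 16.7 would require at least 56 such meioses. By convention a lod score of 3 is taken as significant evidence for linkage.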

Linkage analysis is also being used to identify loci contributing
to canine hip dysplasia (CHD). CHD is a degenerative disease,
primarily in large breeds, that results, in part, from
increased joint laxity. Functional subluxation of the hip, poor
hip joint congruity, and debilitating secondary osteoarthritis
are hallmarks of CHD. Although the mode of inheritance is
unknown, recent investigations offer heritability estimates
ranging from 0.11 to 0.68 (Bliss et al., 2002). An outcrossed 
pedigree of Labrador Retrievers and Greyhounds has been con¬ 
structed for use in linkage studies of CHD and is currently 
being analyzed by a genome-wide screen (Todhunter et al., 
2003). 

Linkage disequilibrium 

Linkage disequilibrium (LD) mapping relies upon population-based
allele and marker frequencies. A marker and an allele are considered
to be in disequilibrium if they do not assort independently
within the population. In LD, the marker and allele segregate
together at a greater frequency than would be expected by their 
individual frequencies. This type of association would be 
expected for genes and markers in a population with common 
ancestral origin. Many breeds are derived from very small 
founding populations and gene flow between breeds is re¬ 
stricted by registration requirements. Thus, LD is a valuable 
tool for identification of genes responsible for hereditary dis¬ 
eases and other traits of interest. 
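
The notion of a marker and an allele occurring together more often than expected from their individual frequencies can be quantified. The Python sketch below uses the standard disequilibrium coefficient D and the squared correlation r^2; it is our illustration, and the haplotype frequencies are hypothetical rather than data from any canine study.

```python
def ld_statistics(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of haplotypes carrying both allele A and marker allele B
    p_a, p_b: individual frequencies of A and B in the population
    """
    d = p_ab - p_a * p_b  # excess over the frequency expected under independence
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, r_squared

# Hypothetical frequencies: the disease allele always travels with the marker.
d_full, r2_full = ld_statistics(p_ab=0.2, p_a=0.2, p_b=0.2)
# Hypothetical frequencies: the two assort independently.
d_none, r2_none = ld_statistics(p_ab=0.04, p_a=0.2, p_b=0.2)

print(f"complete LD: D={d_full:.2f}, r^2={r2_full:.2f}")
print(f"independent: D={d_none:.2f}, r^2={r2_none:.2f}")
```

D is zero when the marker and allele assort independently and positive when they co-occur more often than expected; r^2 rescales this to the range 0 to 1.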

Evidence for the utility of LD was presented by Ostrander 
and Kruglyak (2000) through analysis of cancers in the Golden 
Retriever. The estimated risk of malignancy for the Golden 
Retriever is significantly higher than that for other breeds 
(Priester and McKay, 1980). If the genes for malignancy are 
spread by a popular sire, LD would be expected over large dis¬ 
tances (Ostrander et al., 2000). The calculations estimate that 
screening of ten affected individuals has adequate power for 
detection of regions with genetic homogeneity through identity 
by descent, as would be expected within a single line of dogs. If, 
however, dogs from various breeding lines are needed, it is esti¬ 
mated that only 40 dogs would be necessary for screening based 
upon conditions of moderate heterogeneity (Houwen et al., 
1994). In this case, it would be reasonable to assume dogs with¬ 
in a breed carry the same mutation due to a common ancestral 
origin; therefore, the overlap of genetic regions revealed by LD 
would narrow the genomic region containing the causative gene 
(Ostrander et al., 2000). 

A current study using LD concerns canine sebaceous adeni¬ 
tis (SA), an inflammatory disease of the skin characterized by 
hair loss and unpleasant odor, making affected animals far 
from desirable pets. SA is archetypical of a number of heritable 
canine dermatological diseases in that 1) there are a few breeds 
with high prevalence but many breeds have sporadic occur¬ 
rence, 2) there often is an inability to definitively call a dog 
“normal” based on phenotype, 3) onset often occurs during 
maturity and 4) there are no plausible candidate genes. These 
findings preclude a candidate gene approach or traditional link¬ 
age studies. SA has been recognized in over 55 breeds and mon¬ 
grels but the Standard Poodle and Akita have the highest preva¬ 
lence. Analysis of both breeds suggests SA is an autosomal 
recessive trait. However, there are more normal dogs than 
expected for an autosomal recessive disease due to subclinically 
affected dogs and dogs with late-onset disease (Rosser et al., 
1987; Dunstan and Hargis, 1995; Scott et al., 2000; Reichler et 
al., 2001). The Akita (rather than the Standard Poodle) is cur¬ 
rently being studied by several groups pursuing association 
studies to define LD because SA in this breed fits the criteria
established in the human for using association studies (Houwen
et al., 1994; Kruglyak, 1997; De La Chapelle and Wright, 1998; 
Elston, 1998). 

Future of canine genomics 

Clearly there has been an explosion in knowledge of canine 
genetics and the canine genome. This new information is aug¬ 
menting studies of human hereditary diseases, comparative 
genomics and clinical veterinary medicine, yet the full impact 
of the new information remains to be seen. The primary bene¬ 
fits to be derived from study of canine genetics are 1) the addi¬ 
tion of comparative genomic information regarding a species 
evolutionarily and physiologically closely related to the human, 
2) a better understanding of canine hereditary diseases and sub¬ 
sequent elimination of many disease alleles from breeding pop¬ 
ulations and, 3) insight into analogous human hereditary dis¬ 
eases and implementation of therapies, including gene therapy 
protocols. The importance of the latter is illustrated by success 
in gene therapy for two different diseases. Firstly, gene therapy 
prevents clinical manifestations of a lysosomal storage disease 
in a canine model of the disease (Ponder et al., 2002). Secondly, 
a form of early childhood blindness, Leber congenital amauro¬ 
sis, has been corrected by gene therapy in a spontaneous canine 
model of the disease (Acland et al., 2001). These examples and 
others are indicative of the power of the dog as a large animal 
model of human hereditary diseases. The similarity between 
orthologous canine and human genes ensures that analogous 
human disease genes will be identified. In conclusion, research 
in canine genetics and genomics is benefiting the human as well 
as the dog, our companion of many centuries.

Abstract. An extensive number of genes have been implicated
in the initiation and progression of human cancers, aiding
our understanding of the genetic aetiology of this highly
heterogeneous disease. In order to facilitate extrapolation of such
information between species, we have isolated and physically 
mapped the canine orthologues of 25 well-characterised human 
cancer-related genes. The identity of PCR products repre¬ 
senting each canine gene marker was first confirmed by DNA 
sequencing analysis. Each product was then radiolabelled and 
used to screen a genomic BAC library for the domestic dog. The 
chromosomal location of each positive clone in the canine 
karyotype was determined by fluorescence in situ hybridisation 
(FISH) onto canine metaphase preparations. Of the 25 genes,
the FISH localisation of 21 correlated fully with that expected
on the basis of known regions of conserved synteny between the 
human and canine genomes. Three correlated less closely, and 
the chromosomal location of the remaining marker showed no 
apparent correlation with current comparative mapping data. 
In addition to generating useful comparative mapping informa¬ 
tion, this panel of markers will act as a valuable resource for 
detailed study of candidate genes likely to be involved in 
tumourigenesis, and also forms the basis of a canine cancer- 
gene genomic microarray currently being developed for the 
study of unbalanced genomic aberrations in canine tumours. 

Copyright©2003 S. Karger AG, Basel 


It is well established that certain human genes play a key 
role in the initiation and progression of human tumours. An 
increasing number of proto-oncogenes and tumour-suppressor 
genes are being identified throughout the human genome for 
which gain and loss of function respectively have the effect of 
disrupting the normal cell cycle and leading to uncontrolled cell 
proliferation. Such markers act as a valuable resource of data 
for the study of genetic aberrations in malignant cells, and for 
gaining a detailed understanding of how cell cycle pathways
become disrupted in the development of a tumour. In turn,
improved methods of diagnosis and therapy, making use of 
this improved understanding, will undoubtedly continue to 
emerge. 

An increasing interest in a comparative approach to cancer 
studies is developing as a result of recognition of the degree to 
which both clinical and laboratory-based findings in cancer can 
be correlated between different species, for mutual benefit. The 
laboratory rodent has frequently been used as a model for 
human cancers, but the promise provided by comparative stud¬ 
ies with other species, notably the domestic dog, is now well 
recognised. A number of human and canine malignancies are 
highly similar in their clinical presentation and biology, and 
these species also share an extensive degree of genomic conser¬ 
vation as well as a common physical environment. Reciprocal 
molecular cytogenetic studies (“Zoo-FISH analysis”) of human 
(HSA) and dog (CFA) chromosomes have revealed regions of 
highly conserved synteny between their genomes (Breen et al., 
1999a; Yang et al., 1999; Sargan et al., 2000), which allows 
chromosome regions sharing a common genetic ancestry to be 
directly compared. The assimilation of these comparative data 
with information from cytogenetic, radiation-hybrid and
meiotic linkage mapping has led to the development of highly
detailed integrated maps of the domestic dog genome (Breen et
al., 2001; Guyon et al., 2003). Consequently, the physical location
of any gene marker that has been assigned accurately in
one species can now be predicted in the other with a high degree 
of confidence. This leads to the hypothesis that the same may 
apply to the recurrent genomic aberrations that have been char¬ 
acterised in a number of human cancers. Studies to date have 
confirmed that recurrent chromosome aberrations do indeed 
occur in the dog (Dunn et al., 2000; Thomas et al., 2001a, 2003) 
and there is tentative evidence to suggest that they may share a 
common genetic basis with aberrations reported in the corre¬ 
sponding human disease (Thomas et al., 2003). 

In contrast to human studies, to date few canine cancer- 
related genes have been mapped, and fewer still have been stud¬ 
ied in extensive detail. Recognising the potential value of such 
a resource, we have used a combination of new and existing 
genomic information to isolate BAC (bacterial artificial chro¬ 
mosome) clones representing the canine orthologue of 25 can¬ 
cer-related genes. We report PCR amplification conditions and 
partial sequence data for each gene marker, together with the 
canine chromosomal location as determined by fluorescence in 
situ hybridisation (FISH) analysis. We also outline a number of 
potential applications of this resource. 

Materials and methods 

Optimisation of amplification conditions and DNA sequencing analysis 

The loci described in the present study were selected from the extensive 
list of human genes known to be involved in tumour initiation and progres¬ 
sion. The 25 markers for which canine orthologues were isolated represent 
those for which sufficient canine and/or other nucleotide sequence data were 
available at that time for the generation of a sequence tagged site. Primers for 
PCR amplification of partial canine gene sequences were identified from 
existing publications, designed to freely-available nucleotide data, or were 
generated from novel sequence information derived from several sources 
(Table 1). Amplification reactions were performed using a PTC-225 thermal
cycler (MJ Research), in 10-µl volumes comprising 15 pmol of each PCR
primer, 25 ng of canine genomic DNA, 1 U AmpliTaq Gold (Perkin Elmer),
0.2 mM dNTPs (Pharmacia) and 1× AmpliTaq Gold PCR buffer II (Perkin
Elmer). Conditions were first optimised for PCR amplification of each marker
locus from canine genomic DNA by variation of primer-template annealing
temperature (50-60 °C), extension time (1-2 min) and MgCl2 concentration
(1-4 mM). In each case, cycling commenced with a 10-min denaturation/enzyme
activation step at 95 °C followed by 30 cycles of 94 °C (1 min),
optimised annealing temperature (1 min), and 72 °C (1 min, or 2 min for
markers in excess of 1 kb), finishing with a final extension step (72 °C,
5 min). Where possible, conditions were standardised to a 60 °C annealing
temperature and a final concentration of 1.5 mM MgCl2.
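
As a rough guide to run length, the hold times of the cycling programme described above can be summed. This is only an illustrative sketch: the function name `cycling_minutes` is hypothetical, and the figure excludes the ramping time between temperatures, which depends on the thermal cycler and is not given in the text.

```python
def cycling_minutes(extension_min=1, cycles=30, activation_min=10,
                    denature_min=1, anneal_min=1, final_extension_min=5):
    """Sum the programmed hold times (in minutes) of the PCR programme.

    Defaults follow the conditions stated in the text: 10 min at 95 °C,
    30 cycles of 94 °C (1 min), annealing (1 min) and 72 °C extension,
    then a 5-min final extension. Ramp times are NOT included.
    """
    per_cycle = denature_min + anneal_min + extension_min
    return activation_min + cycles * per_cycle + final_extension_min


# Markers up to 1 kb use a 1-min extension; longer markers use 2 min.
print(cycling_minutes(extension_min=1))
print(cycling_minutes(extension_min=2))
```

The 2-min extension used for markers in excess of 1 kb adds 30 minutes of hold time over the 30 cycles.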

Approximately 50 ng of each PCR product were sequenced using BigDye 
dye terminator chemistry and analysed on an ABI 3700 DNA sequencer. 
Sequence information was edited manually to remove regions of poor quality 
data at the beginning and end of the read and to resolve base-calling ambigu¬ 
ities where possible. The identity of each product was confirmed by compari¬ 
son with existing nucleotide sequence data using the BLASTN search tool 
(http://www.ncbi.nlm.nih.gov/blast/) with default parameters (Altschul et al., 
1997). In general, significant database matches were regarded as those gener¬ 
ating a minimum of 85 % nucleotide identity over a region of no fewer than 
60 bp. BLASTN data for each marker were also evaluated manually to inves¬ 
tigate similarities reported with lower levels of significance. 
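
The acceptance rule stated above (at least 85 % nucleotide identity over no fewer than 60 bp) can be written as a simple filter. The sketch below is an illustration only; the function name and argument names are hypothetical, it does not call BLASTN itself, and it does not reproduce the manual evaluation of weaker matches that the authors also performed.

```python
def is_significant_match(percent_identity, aligned_length_bp,
                         min_identity=85.0, min_length_bp=60):
    """Apply the stated thresholds to one BLASTN alignment.

    percent_identity  -- nucleotide identity of the alignment, in percent
    aligned_length_bp -- length of the aligned region, in base pairs
    """
    return (percent_identity >= min_identity
            and aligned_length_bp >= min_length_bp)


# An 88 % match over 64 bp satisfies both thresholds, whereas a 95 % match
# over only 57 bp fails on length and would need manual inspection.
print(is_significant_match(88.0, 64))
print(is_significant_match(95.0, 57))
```

Alignments that fail the filter are not necessarily false; a short but near-identical match is the kind of case that warrants the manual evaluation described in the text.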

Isolation of corresponding canine BAC clones 

Radiolabelled PCR products representing each canine gene marker were 
used to screen groups of three filters selected from the RPCI-81 canine
genomic BAC library (Li et al., 1999) as previously described (Thomas et al.,
1999). Clones generating positive hybridisation signals were obtained as bac¬ 
terial slants from the Human Genome Mapping Project Resource Centre 
(HGMP-RC, Babraham, Cambs, UK). Each clone was verified initially by 
PCR amplification directly from a crude bacterial suspension, using the 
appropriate primers to confirm that a single product of the expected size was 
produced. BAC DNA was then isolated using a standard alkaline lysis meth¬ 
od and was used for sequence confirmation and FISH mapping. 

Chromosome assignment of canine cancer-related gene markers 

Canine BAC clones were physically mapped by FISH analysis as pre¬ 
viously described (Thomas et al., 1999). Briefly, 500 ng of BAC DNA were 
labelled with either Spectrum Red-, Orange- or Green-dUTP (Vysis),
diethylaminomethylcoumarin (DEAC)-5-dUTP (NEN), or biotin-16-dUTP
(Boehringer) by nick translation. Typically, 25 ng of each of five
differentially-labelled probes were pooled into one overnight hybridisation
reaction, in the presence of 15 µg of sonicated dog genomic DNA as competitor.
Biotinylated probes were detected with Cy5-conjugated avidin (4 µg/ml,
Amersham Pharmacia). A minimum of 10 canine metaphase spreads were analysed
and chromosome assignments were made according to the nomenclature of Breen
et al. (1999b). Where necessary, chromosome assignment was confirmed by
co-hybridisation with a probe selected from a panel of 41 chromosome-specific
canine BAC markers described elsewhere (Thomas et al., 2003;
http://www.cvm.ncsu.edu/mbs/breen_matthew.htm). Canine marker assignments were
then compared to the chromosomal location of the human orthologue of the 
corresponding gene (Table 1). 


Results 

Each of the 25 markers described in Table 1 was successful¬ 
ly optimised for the PCR and used in DNA sequencing analy¬ 
sis. A database similarity search using BLASTN under the cri¬ 
teria described above confirmed the identity of 22 marker 
products. The degree of identity ranged from 88 to 100%, over 
regions ranging between 64 and 404 nucleotides. The most sig¬ 
nificant database matches were observed predominantly with 
either previously annotated DNA or mRNA data for the corre¬ 
sponding human gene (REL, FES, FOS, NF1, MDM2, RAF1, 
RB1, ERBB2), or with existing DNA or mRNA sequence for
the canine gene where those data are available (MYB, SAS, 
PAX3, KIT, BRCA1, TP53, WT1, MYC, KRAS, TSC2, 
HRAS). The most significant matches for RET, CDK4 and 
PDGFB were observed with data obtained from the horse, pig 
and domestic cat respectively. Data for NRAS, YES1 and 
INSR fell marginally outside the criteria required to confirm a 
database match, demonstrating 95-98% identity with the hu¬ 
man gene over regions of between 41 and 57 nucleotides, but on 
manual examination of all data these were considered to repre¬ 
sent true matches. All sequence data generated in this study 
have been deposited into the GenBank database (accession 
numbers AJ563727 to AJ563748). 

In two instances (NRAS, PAX3), sequence data from the 
PCR product generated using the original marker primers were 
subsequently used to design new internal primers, to produce a 
shorter and more robust product more suited to PCR amplifi¬ 
cation and BAC library screening. The original NF1 and WT1 
markers (Venta et al., 1996) both contained a repetitive micro- 
satellite-like sequence, and new primers were designed to avoid 
this region to prevent interference with BAC library screening. 
Sequence data generated using these primers confirmed amplification
of the expected product. The origin of PCR primers or
DNA sequence data for each gene marker are given in Table 1,
with details of the optimised PCR conditions and observed
product sizes.

Table 1. The symbol, description and human chromosome assignment of each gene locus were taken from the Human Gene Nomenclature Database
(http://ash.gene.ucl.ac.uk/nomenclature/). Dog chromosome assignments were derived in the present study.

Gene   Human assignment  Canine assignment  Description
NRAS   1p13              17q22-q23          neuroblastoma RAS viral (v-ras) oncogene homolog
REL    2p13-p12          10q27              v-rel reticuloendotheliosis viral oncogene homolog (avian)
MYB    6q22-q23          1q14               v-myb myeloblastosis viral oncogene homolog (avian)
SAS    12q13-q14         10q12-q14prox      sarcoma amplified sequence
FES    15q25-qter        3q22.1dist-q22.3   feline sarcoma oncogene
YES1   18p11.31-p11.21   7q22-q24           v-yes-1 Yamaguchi sarcoma viral oncogene homolog 1
INSR   19p13.3-p13.2     20q16              insulin receptor
PAX3   2q35-q37          37q16-q17          paired box gene 3 (Waardenburg syndrome 1)
KIT    4q11-q12          13q21.3            v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
RET    10q11.2           28q12              ret proto-oncogene (multiple endocrine neoplasia and medullary thyroid carcinoma 1, Hirschsprung disease)
FOS    14q24.3           8q31               v-fos FBJ murine osteosarcoma viral oncogene homolog
BRCA1  17q21-q24         9q21.2             breast cancer 1, early onset
TP53   17p13.1           5q21               tumor protein p53 (Li-Fraumeni syndrome)
WT1    11p13             18q22.2-q22.3      Wilms tumor 1
NF1    17q11.2           9q23               neurofibromin 1 (neurofibromatosis, von Recklinghausen disease, Watson disease)
MYC    8q24              13q12.3-q13prox    v-myc myelocytomatosis viral oncogene homolog (avian)
CDK4   12q13             15q24.3            cyclin-dependent kinase 4
MDM2   12q13-q14         31q12              Mdm2, transformed 3T3 cell double minute 2, p53 binding protein (mouse)
KRAS   12p12.1           27q13              v-Ki-ras2 Kirsten rat sarcoma 2 viral oncogene homolog
RAF1   3p25              20q11dist-q12prox  v-raf-1 murine leukemia viral oncogene homolog 1
TSC2   16p13.3           6q21.2-q21.3       tuberous sclerosis 2
HRAS   11p15.5           18q22.1            v-Ha-ras Harvey rat sarcoma viral oncogene homolog
PDGFB  22q12.3-q13.1     10q23              platelet-derived growth factor beta polypeptide (simian sarcoma viral (v-sis) oncogene homolog)
RB1    13q14.2           22q11.2            retinoblastoma 1 (including osteosarcoma)
ERBB2  17q11.2-q12       9q21.3dist         v-erb-b2 erythroblastic leukemia viral oncogene homolog 2, neuro/glioblastoma derived oncogene homolog (avian)

Between one and four BAC clones were isolated for each 
gene marker. Each was cytogenetically assigned and demon¬ 
strated a unique hybridisation site on a single chromosome 
pair. Thirteen of the 25 genes were represented by multiple 
BAC clones. Where duplicate clones were isolated, each mapped
to the same chromosomal location. The address and chromosome
location of each BAC clone isolated, and the location
of the human orthologue, are given in Table 2.

Discussion 

Of the 25 canine genes mapped in the present study, the 
assignments of 21 genes correlate fully with that expected on 
the basis of known dog-human comparative genome data 
(Breen et al., 1999a; Yang et al., 1999; Sargan et al., 2000). 
Assignments of a further three genes (RET, CDK4 and HRAS) 
do not correlate precisely, but suggest that the locations and 
boundaries of regions of conserved synteny identified in these 
prior studies might become further modified as additional data 
increase the resolution of comparative mapping information. 
The assignment of canine MDM2 markers appeared to show no
correlation with existing dog-human comparative data. Human
MDM2 lies at HSA12q13→q14. Comparative cytogenetic
data would place the canine orthologue on dog chromosome 10
(CFA10) or CFA27, according to Breen et al. (1999a, 2001) and
Sargan et al. (2000) respectively. In the present study, two
clones representing MDM2 were both assigned to CFA31q12,
which is evolutionarily related to HSA21q22.

To our knowledge, few of the genes described in the present 
study have previously been mapped cytogenetically. We have 
previously reported the mapping of canine FOS and TP53 
(Thomas et al., 2001b, 2001c). Canine BRCA1 and NF1 were 
assigned to CFA9q21 and 9q24 by Werner et al. (1997), by 
probing canine bacteriophage and cosmid libraries with human 
cDNA clones to identify positive clones and assigning by FISH 
analysis. These assignments correlate with data from the 
present study, allowing for the use of a slightly different
chromosome banding nomenclature in the two studies. The assignment
of the canine PAX3 gene to CFA37q16→q27 correlates
with an existing report by Krempler et al. (2000), in which BAC 
clone 300j24, isolated using a murine PAX3 cDNA clone as a 
probe for the RPCI-81 canine BAC library, was assigned to the 
same site. 

Murua Escobar et al. (2001) have previously reported the
isolation and assignment of a putative canine ERBB2 BAC
(clone 401b7 from the RPCI-81 canine BAC library), to
CFA1q13.1. This is not in agreement with data from the
present study nor with the assignment predicted from comparative
chromosome painting studies, which would place the
canine orthologue on CFA9 (Breen et al., 1999a; Yang et al.,
1999; Sargan et al., 2000). Clone 401b7 was isolated using a
human ERBB2 cDNA, whose origin is not clear, but whose
identity was described as having been verified by sequencing
analysis. No ERBB2-related genes appear to lie in the human
chromosome region corresponding to CFA1q13.1, namely
HSA18q12.2→q23. Clone 401b7 did not generate a positive
hybridisation signal during BAC library screening for ERBB2
in the present study, using the marker primers of Parker et al.
(2001). We have also applied these primers to DNA isolated
from clone 401b7 with appropriate controls, and no product
was obtained. We have confirmed the assignment of clone
401b7 by FISH to CFA1q13dist→q14 (data not shown). More
detailed sequencing analysis of both putative ERBB2 BAC
clones may help to elucidate the most likely explanation for
these observations.

Table 2. The addresses of canine BAC clones representing each gene marker are followed by primer sequences, amplification conditions (annealing
temperature in °C/elongation time in minutes/MgCl2 concentration), product size and origin of the PCR markers used in the isolation of each clone

Gene   Canine BAC clones             Primer 1 (5'-3')          Primer 2 (5'-3')          Conditions     Size (bp)  Origin
NRAS   296d7, 405p9                  AAAAGCGCACTGACAATCC       ATTGGTCTCTCATGGCACTG      60/1min/1.5mM  2500       Lyons et al. (1997)
                                     CTACCTCTGGCTTTTGGTTC      CTTCGTTATTGTACTCCTGAATGC  60/1min/2.0mM  110        Designed to data derived from the above
REL    017g24, 041c13                GGAAGTGTCAGAGGAGGAGATG    TCACTAACTTCCTGGTCAGAAGG   57/1min/2.0mM  820        Lyons et al. (1997)
MYB    037k7                         TGCTGAACCCTGAACTCATC      GCTTGTGTGCCTGGTAAATG      57/1min/3.0mM  500        Lyons et al. (1997)
SAS    173l24                        GATTTAAAGAGACAGAAGCTGC    ATGTAGACCACGTTGAGAGC      60/1min/1.5mM  180        D. Sargan (pers. comm.). Primers also present in Genbank AJ388529
                                     CGATTTAAAGAGACAGAAGCTG    GTAGACCACGTTGAGAGCGC                                Genbank AJ388529 (different from CUVS primers)
FES    173a8, 191f3, 198o19, 284a21  GGGGAACTTTGGCGTGTT        TCCATGACGATGTAGATGGG      60/1min/1.5mM  500        Venta et al. (1996)
YES1   158d12, 210f6, 215e5          GTCTGATGTCTGGTCATTTGGA    TCAATTCATGGAGGGATTCTG     57/1min/2.0mM  1100       Genbank S81472
INSR   401o24                        ACCTCAGTTTCCCCAAACTC      GTTGAAGAAGAGACGGGAGC      55/1min/3.0mM  125        Lyons et al. (1997)
PAX3   042c13, 257h23                TGCGTCTCTAAGATCCTGTGC     TGGAGTCTTCCCTTAGCGTTTA    60/1min/1.5mM  280        Krempler et al. (2000)
KIT    170b9, 265l22                 GAGTATCATCGGCTCTGCTTG     CTGGCTGTTTTCCTTTATCCAC    60/2min/2.0mM  1300       Genbank AF099030
RET    199a15                        CATCCAGTTAGCATATACAC      CCCTTCCACATGGATTGAAA      55/2min/2.0mM  1400       Venta et al. (1996)
FOS    145a3                         CCGTCAAGAGCATCGGCAG       CCTCAGGGTAGGTGAAGACT      60/1min/1.5mM  280        Thomas et al. (2001b)
BRCA1  312m21, 321l6                 GTGAGTGAATGACAGTGGGAAA    CTGTTAAGAGAGGTGGGGAATG    60/1min/2.0mM  200        Genbank AF250234.1
TP53   278e24                        GTGTAACAGTTCCTGCATGGG     CACGCCCACGGATCTG          57/1min/1.5mM  1200       Lyons et al. (1997); Thomas et al. (2001c)
WT1    183b9, 229c9                  GTTTTACCTGTATGAGTCCT      GAGAAACCATACCAGTGTGA      55/2min/2.0mM  850        Venta et al. (1996)
                                     TGGTTTCACACCTAAATGGACA    CAGCTGAGACTGGTGGGAAG      60/1min/1.5mM  275        Designed to data derived from the above
NF1    217g22, 287l12                CAAAGCTTCTGTGACTGTTT      ATTCACTCTCTGTGTACTTG      60/2min/1.5mM  1200       Venta et al. (1996)
                                     TTCTAAAGTGCCAAATTTCACG    TAACAGTGTTGTGTTACTGTCA    60/1min/1.5mM  165        Designed to data derived from the above
MYC    396j5                         GTTGTTTCTGTGGAAAAAAGGC    GTTGTGCTGATGGGTGGAC       60/1min/1.5mM  140        Genbank X95367.1
CDK4   179d9, 266e20                 CTCTCTTCTGTGGAAACTCTG     TCCTCCATCTCAGGTACCAC      60/1min/1.5mM  180        D. Sargan (pers. comm.)
MDM2   163l8, 191b12                 CAGCTTCGGAACAAGAGACC      TTGGCACTCCAAACAAATCTC     60/1min/1.5mM  220        Genbank AF100705
KRAS   316a17                        CTGAATATAAACTTGTGGTAGTTG  CTATTGTAGGATCATATTCATCC   60/1min/1.5mM  120        Genbank S42999
RAF1   303d8, 318k6                  AGGTGCATGAAGAGGTCCC       CCTCGGAAACAGCTCAGTTC      55/1min/1.5mM  320        IMAGE locus stSG53760 (Deloukas et al. 1998)
TSC2   319j14, 319j19                AATGGGCCTCTCTTCTATGC      GACCGTCACCCCACTCTG        60/1min/1.5mM  270        Genbank AF152608
HRAS   245k4                         CCTGTACTGGTGGATGTCC       GACTCCTATCGGAAGCAAGT      55/1min/1.5mM  180        Mayr et al. (1999)
PDGFB  156a15                        GGTGACCATTCGGACGGTG       TGGCTCCGAGGGTCTCCTTC      55/1min/1.5mM  120        D. Sargan (pers. comm.)
RB1    312b24                        CAACTGCACAGTGAATCCA       CTGAGAGCCATGCAAGGG        50/1min/2.0mM  750        M. Das (pers. comm.)
ERBB2  311o18                        TGTGACGTGGGAACTGATGAC     CTTCCAAGGCTCCTTTAACCATAG  60/1min/1.5mM  280        Parker et al. (2000)

This study reports the isolation and chromosomal assign¬ 
ment of 41 large genomic clones representing 25 canine cancer- 
related genes. The data obtained have great value as additional 
mapping resources in confirming and refining existing knowl¬ 
edge of regions of conserved synteny between human and 
domestic dog genomes. The isolation of these genomic clones 
will allow evaluation of the degree of DNA sequence conserva¬ 
tion between these genes in both species. Their nature as canine 
orthologues of human genes involved in cell cycle regulation 
and/or tumourigenesis also makes them ideal templates for 
more direct studies of these genomic regions. This may be by 
detailed DNA sequencing analysis, to determine the presence 
of specific nucleotide alterations that correlate with the disease
state either in inherited or sporadic cancers. The panel of clones
also represents useful molecular cytogenetic resources, in identi¬ 
fying and characterising chromosomal regions shown to be 
recurrently aberrant in tumour cells. They may act as markers 
for the abnormal co-location of genes brought together as a result 
of chromosome translocation events, or to detect over- or 
under-abundance of the genomic site on which they lie in 
instances of chromosome imbalance, particularly pertinent in 
the case of proto-oncogenes and tumour-suppressor genes. Their 
application in the latter instance has been investigated by their 
inclusion in the development of a canine BAC microarray, a 
resource for performing high-resolution comparative genomic 
hybridisation (CGH) analysis of recurrent chromosomal imbalances
in canine tumours (Thomas et al., pp 254-260 this issue).

Abstract. As with many human cancers, canine tumors
demonstrate recurrent chromosome aberrations. A detailed 
knowledge of such aberrations may facilitate diagnosis, progno¬ 
sis and the selection of appropriate therapy. Following recent 
advances made in human genomics, we are developing a DNA 
microarray for the domestic dog, to be used in the detection 
and characterization of copy number changes in canine tumors. 
As a proof of principle, we have developed a small-scale 
microarray comprising 87 canine BAC clones. The array is 
composed of 26 clones selected from a panel of 24 canine cancer
genes, representing 18 chromosomes, and an additional set
of clones representing dog chromosomes 11, 13, 14 and 31. 
These chromosomes were shown previously to be commonly 
aberrant in canine multicentric malignant lymphoma. Clones 
representing the sex chromosomes were also included. We out¬ 
line the principles of canine microarray development, and 
present data obtained from microarray analysis of three canine 
lymphoma cases previously characterized using conventional 
cytogenetic techniques. 

Copyright©2003 S. Karger AG, Basel 


Chromosome-based comparative genomic hybridization 
(CGH) (Kallioniemi et al., 1992) is a fluorescence in situ 
hybridization (FISH) technique that provides a means to iden¬ 
tify imbalanced genomic aberrations in human tumors, with¬ 
out the need to generate tumor chromosome preparations. In 
the last decade, CGH analysis has been widely used to demon¬ 
strate genome imbalances in a variety of human tumor types 
(for example: Lichter, 2000). In its conventional form, CGH 
analysis uses metaphase chromosome preparations from normal
cells as the target for hybridization with tumor DNA as the
probe. Regions of the tumor genome demonstrating copy num¬ 
ber changes are described using conventional cytogenetic ter¬ 
minology. We recently optimized CGH technology for applica¬ 
tion to the dog (Dunn et al., 2000) and used this approach to 
investigate chromosomal aberrations associated with canine 
lymphoma (Thomas et al., 2001, 2003a), osteosarcoma and
brain neoplasia (Breen, personal communication). 

Compared to human CGH analysis, there are a number of 
additional limitations associated with canine metaphase-based 
CGH, all of which are related to the use of chromosome prepa¬ 
rations as the target for hybridization. First, an ability to identi¬ 
fy accurately each chromosome by banding analysis is of funda¬ 
mental importance if metaphase-based CGH is to be applied 
successfully. The difficulty in reliably identifying dog chromo¬ 
somes in this manner frequently necessitates verification of the 
identity of aberrant chromosomes with chromosome-specific 
reagents such as whole chromosome paint probes and single 
locus probes. While such reagents are now available, the prob¬ 
lems associated with identification of canine chromosomes pre¬ 
cludes all but the most experienced canine cytogeneticists from 
using metaphase-based CGH analysis as a means to routinely 
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Table 1. Identification and chromosomal assignment of each BAC clone on the canine array. Clones are listed in genomic order, followed by the assigned 
clone number, the clone ID (gene name or chromosome of origin as appropriate) and then by its precise sub-chromosomal assignment determined by FISH 
analysis. With the exception of clone 322, all clones produced a unique cytogenetic location when analyzed by FISH. Clone 322 demonstrated a primary 
hybridization site at 13q21.1 q21.2, but also secondary hybridization sites (denoted “++”) on a number of other chromosomes. 


Genomic order   Code   Clone ID   Assignment
001             281    MYB        1q14
002             298    FES        3q22.1dist-q22.3
003             424    TP53       5q21
004             423    TSC2       6q21.2-q21.3
005             290    YES1       7q22-q24
006             800    FOS        8q31
007             426    BRCA1      9q21.3
008             445    NF1        9q23
009             272    CFA 9      9q23-q24
010             269    SAS        10q12-q14prox
011             409    PDGFB      10q23
012             270    REL        10q27
013             680    CFA 11     11q12.1prox
014             317    CFA 11     11q12.1
015             566    CFA 11     11q12.3
016             646    CFA 11     11q12.3-q13
017             201    CFA 11     11q13
018             865    CFA 11     11q13
019             866    CFA 11     11q13
020             579    CFA 11     11q14prox
021             204    CFA 11     11q15
022             507    CFA 11     11q15
023             672    CFA 11     11q15
024             639    CFA 11     11q16
025             239    CFA 11     11q21
026             582    CFA 11     11q21-q22.1prox
027             226    CFA 11     11q22.1
028             329    CFA 11     11q22.1-q22.2
029             203    CFA 11     11q22.2-q22.3
030             246    CFA 13     13q11
031             277    MYC        13q12.3
032             335    CFA 13     13q13
033             080    CFA 13     13q13-q14
034             403    CFA 13     13q13-q14
035             299    CFA 13     13q14
036             641    CFA 13     13q14
037             322    CFA 13     13q21.1-q21.2 ++
038             599    CFA 13     13q21.1-q21.2
039             268    KIT        13q21.2
040             052    CFA 13     13q21.2-q21.3
041             363    CFA 13     13q21.3dist
042             400    CFA 13     13q22.1
043             502    CFA 14     14q13-q14
044             543    CFA 14     14q13-q14
045             671    CFA 14     14q14
046             406    CFA 14     14q14-q15
047             292    CFA 14     14q15-q21.1
048             524    CFA 14     14q15-q21.1
049             651    CFA 14     14q21.1
050             655    CFA 14     14q21.2
051             511    CFA 14     14q21.3prox
052             303    CDK4       15q24.3
053             419    CDK4       15q24.3
054             435    NRAS       17q22-q23
055             429    HRAS       18q22.1
056             311    WT1        18q22.2-q22.3
057             306    CFA 18     18q22.3
058             653    CFA 18     18q25
059             415    RAF1       20q11dist-q12prox
060             283    INSR       20q16
061             767    RB1        22q11.2
062             450    KRAS       27q13
063             251    RET        28q12
064             247    MDM2       31q12
065             266    MDM2       31q12
066             202    CFA 31     31q12
067             039    CFA 31     31q12dist
068             635    CFA 31     31q14
069             012    CFA 31     31q14
070             211    CFA 31     31q14
071             512    CFA 31     31q15
072             220    CFA 31     31q15.1
073             206    CFA 31     31q15.2
074             453    PAX3       37q16-q17
075             803    CFA X      Xp22
076             026    CFA X      Xp21.3
077             065    CFA X      Xp21.2
078             372    CFA X      Xp21.1
079             027    CFA X      Xp21.1
080             593    CFA X      Xp11.3
081             611    CFA X      Xp11.1
082             186    CFA X      Xcen
083             260    CFA X      Xq21.3
084             215    CFA X      Xq22
085             804    CFA Y      Yq11.1
086             808    CFA Y      Yq11.1-q11.21
087             365    CFA Y      Yq11.2


identify, with confidence, the aberrations associated with dog 
cancers. Second, even in experienced hands, metaphase-based 
CGH analysis is time consuming and labor intensive. Each case 
requires a considerable time investment for the collection of 
suitable images for analysis and a high degree of expertise is 
needed to ensure that all chromosomes are karyotyped correctly. 
Third, metaphase-based CGH has a resolution limited to 
approximately 5-10 Mb. While this is acceptable for the detection 
of aneuploidy and large regions of chromosome gain/loss, 
there is a need to develop higher resolution technology that will 
enable us to identify smaller regions of change. 

Significant advances have been made recently in the development 
of array-based CGH technology as applied to human 
cancers. Array CGH analysis is performed in a similar manner 
to metaphase-based CGH analysis, with the exception that a 
panel of genomic clones, immobilized on a glass slide, replaces 
metaphase chromosomes as the target. A variety of approaches 
have been used in the development of genomic microarrays 
(Solinas-Toldo et al., 1997; Pinkel et al., 1998; Albertson et al., 
2000; Hodgson et al., 2001; Snijders et al., 2001). Recently, 
Fiegler et al. (2003) described a method of generating genomic 
microarrays for human CGH analysis using degenerate 
oligonucleotide-primed (DOP) PCR amplification of large insert 
genomic clones. This method used DOP-PCR primers designed 
specifically to be inefficient at amplifying E. coli host 
DNA whilst retaining the ability to generate good representations 
of human sequence. The approach is now at a stage where 
the technology could be adapted for non-human applications. We 
therefore used this approach to generate a panel of DOP-PCR 
products from selected canine BAC clones to produce the first 
canine genomic microarray. The use of this array to detect 
known aberrations in cases of canine multicentric malignant 
lymphoma that we have analyzed previously by metaphase-based 
CGH (Thomas et al., 2003a) is described. 

Materials and methods 

Selection of loci represented on the array 

A total of 96 loci were represented on the array, of which 87 were 
canine BAC clones selected from the RPCI-81 canine genomic BAC library 
(Li et al., 1999) (Table 1). Of these, 26 clones contained 24 putative canine 
cancer-related genes (Thomas et al., 2003b), mapping to 18 distinct 
chromosomes. The panel of cancer-associated BACs was supplemented with 61 
clones selected on the basis of their location within the canine integrated 
genome maps (Breen et al., 2001; Guyon et al., 2003). This included 14 
clones mapping to the canine sex chromosomes (11 clones on dog 
chromosome [CFA] X and three on CFAY), which permitted analysis of 
sex-mismatched (male vs. female) hybridizations. The remaining 47 clones were 
selected to represent those chromosomes found to be most commonly 
aberrant in prior metaphase-based CGH studies of canine lymphomas (Thomas 
et al., 2001, 2003a), namely CFA11 (17 clones), CFA13 (11 clones), CFA14 
(9 clones) and CFA31 (8 clones). The chromosome assignment of each canine 
BAC clone was confirmed by FISH analysis as previously described (Thomas 
et al., 1999), using the chromosome nomenclature of Breen et al. (1999a, 
1999b). The order of clones along each chromosome was determined by a 
series of co-hybridization reactions, each comprising five differentially 
labeled probes (Table 1). 

Three Drosophila BAC clones, derived from the RPCI-98 library 
(http://www.chori.org/bacpac), were included on the array as controls for the 
assessment of non-specific hybridization (Fiegler et al., 2003). Vector DNA 
isolated from the BAC library vector (pBACe3.6), and canine, feline, human 
and hamster whole genomic DNA controls were processed alongside all 
canine BAC templates, in addition to a negative (water-only) control. 

DNA microarrays 

Canine BAC clones were obtained from the Human Genome Mapping 
Project Resource Centre (HGMP-RC, Hinxton, Cambs, UK). DNA 
extracted from single colonies was amplified by DOP-PCR as described 
previously (Fiegler et al., 2003) and arrayed in triplicate onto 3D link-activated 
slides (Motorola) using a MicroGrid II arrayer (BioRobotics). The maximum 
dimensions of the arrayed region were 20 mm × 5 mm. Details of post-arraying 
treatment of slides can be found at 
http://www.sanger.ac.uk/Projects/Microarrays. 

Labeling of genomic DNA probes 

The reference individuals used in this study were shown previously to 
demonstrate a normal karyotype (Dunn et al., 2000). Test (tumor) and 
reference (normal) genomic DNAs were labeled in separate reactions using a 
BioPrime Labelling Kit (Invitrogen) with modified conditions. Briefly, a 
130.5-µl reaction was set up containing 450 ng of either test or reference 
genomic DNA and 60 µl of 2.5× Random Primers Solution. After denaturing 
the mixture at 100 °C for 10 min, 15 µl of a dNTP mix (1 mM dCTP, 2 mM 
each of dATP, dGTP and dTTP), 1.5 µl of 1 mM Cy3-dCTP or Cy5-dCTP 
(Amersham Biosciences) and 3 µl of Klenow fragment were added on ice and 
subsequently incubated at 37 °C overnight. The reaction was terminated by 
addition of 15 µl of 500 mM EDTA and the labeled probes were purified using 
G50 sephadex spin columns (Microspin G50, Amersham Biosciences). 

Array hybridization 

Array hybridization was generally performed as described previously 
(Fiegler et al., 2003) with slight modifications. Briefly, for prehybridization of 
the arrays, 35 µg of canine Cot1 DNA (Dunn et al., 2000) and 200 µg of 
herring sperm DNA (Sigma) were combined, precipitated and resuspended 
in 40 µl of hybridization buffer (50% formamide, 10% dextran sulphate, 
0.1% Tween 20, 2× SSC, 10 mM Tris pH 7.4). Prehybridization of the arrays 
was performed as described previously (Fiegler et al., 2003). 

Differentially-labeled test and reference genomic DNA probes (40 µl 
each) were co-precipitated with 35 µg of canine Cot1 DNA and resuspended 
in 20 µl of hybridization buffer and 1.5 µl yeast tRNA (100 µg/µl). The 
combined probe was denatured at 72 °C for 10 min, incubated at 37 °C for 
60 min and hybridized to the array at 37 °C for 48 h as previously described 
(Fiegler et al., 2003). 

Imaging and data analysis 

Slides were scanned using an Axon 4000B scanner (Axon Instruments 
Inc.) and the resulting images analyzed with GenePix Pro 3.0 (Axon 
Instruments Inc.). Spots were defined by the automatic grid feature and the grid 
was manually adjusted where necessary. Cy3 and Cy5 fluorescence 
intensities were calculated for each spot after local background subtraction, and 
ratios of normalized values established. The mean fluorescence ratio of each 
triplicate was then reported. 

Array analysis of canine lymphoma cases 

To examine the correlation between conventional CGH and array CGH 
data, three canine multicentric lymphoma cases were selected for array CGH 
analysis. Metaphase-based CGH analysis of these cases previously showed 
the presence of imbalanced aberrations that included CFA11, CFA13, 
CFA14 or CFA31, corresponding to those autosomes for which the most 
BAC clones were intentionally represented on the array. Test and reference 
DNA samples (of the same sex) were differentially labeled with Cy3 and Cy5 
conjugated nucleotides respectively. Data analysis was performed as 
described above. Signal intensities were normalized to a mean 1:1 ratio on 
autosomal clones, reflecting a modal ratio after subtraction of local 
background intensities. Following standard conventions for CGH analysis, clones 
demonstrating a test:reference fluorescence ratio greater than 1.15:1 (gain) or 
less than 0.85:1 (loss) were classified as aberrant. 
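
The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the GenePix workflow used in the study: the function names, the spot tuples and the intensity values are hypothetical, and the order of operations (per-spot ratio, triplicate mean, then autosomal normalization) is an assumption. Only the 1.15:1 and 0.85:1 thresholds are taken from the text.

```python
from statistics import mean

GAIN_THRESHOLD = 1.15  # test:reference ratio above this is scored as a gain
LOSS_THRESHOLD = 0.85  # test:reference ratio below this is scored as a loss


def mean_ratio(spots):
    """Mean test:reference ratio over replicate spots.

    Each spot is (test, test_background, reference, reference_background);
    the local background is subtracted before the ratio is taken.
    """
    return mean((t - tb) / (r - rb) for t, tb, r, rb in spots)


def normalize(ratios, autosomal_clones):
    """Rescale all ratios so that the autosomal clones average 1.0."""
    scale = mean(ratios[clone] for clone in autosomal_clones)
    return {clone: value / scale for clone, value in ratios.items()}


def classify(ratio):
    """Apply the 1.15:1 / 0.85:1 cut-offs quoted in the text."""
    if ratio > GAIN_THRESHOLD:
        return "gain"
    if ratio < LOSS_THRESHOLD:
        return "loss"
    return "balanced"


# Hypothetical triplicate spots (made-up intensities, not data from the study).
spots = {
    "auto_1": [(1200, 100, 1100, 100)] * 3,
    "auto_2": [(1200, 100, 1100, 100)] * 3,
    "X_1": [(1750, 100, 1100, 100)] * 3,
    "Y_1": [(760, 100, 1100, 100)] * 3,
}
raw = {clone: mean_ratio(s) for clone, s in spots.items()}
normalized = normalize(raw, ["auto_1", "auto_2"])
calls = {clone: classify(r) for clone, r in normalized.items()}
print(calls)
```

Normalizing on the autosomal clones removes a global dye bias (here a hypothetical 1.1-fold excess in the test channel) before the fixed thresholds are applied.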


Results 

FISH analysis of canine BAC clones 

Of the 87 canine BAC clones represented on the array, 86 
clones hybridized to a unique and consistent chromosomal 
location. The exception, clone 322, demonstrated a primary 
hybridization site at CFA13q21.1→q21.2, plus weaker hybridization 
to multiple sites throughout the genome. Clone 653 was 
proposed by RH-mapping analysis to lie on CFAX, but FISH 
analysis demonstrated a unique hybridization site on CFA18 
and so the clone was reclassified as an anonymous CFA18 
marker. Clones 272 and 306 were additional putative canine 
cancer gene markers, but their identity was not confirmed by 
DNA sequence analysis and so they were included in the panel 
as anonymous CFA9 and CFA18 markers, respectively. The 
identities of the remaining 26 cancer gene markers were confirmed 
and are reported elsewhere (Thomas et al., 2003b). The 
chromosomal assignment of each clone is shown in Table 1. 

Validation of control hybridization data 

Validation of the array was performed through a series of 
test hybridizations, including differentially labeled canine 
reference genomic DNA (self-self hybridizations), and canine 
male (Cy3) and female (Cy5) reference genomic DNA 
(sex-mismatched hybridizations). Hybridization intensities were 
calculated for each spot after local background subtraction. Signal 
intensities were normalized to a mean 1:1 ratio on the autosomal 
clones. The linear ratio of the intensities was then plotted 
against the order of the clones along the genome. Typical examples 
of such hybridizations are shown in Fig. 1. While all clones 
on the array show a ratio closely approximating 1:1 (range 0.95 
to 1.05) in self-self hybridizations (Fig. 1a), a distinct copy 
number gain can be detected for CFAX clones in sex-mismatched 
hybridizations (Fig. 1b), with a corresponding loss 
apparent for CFAY clones. Fluorescence ratios for the ten 
CFAX-specific clones ranged from 1.35:1 to 1.81:1 (female:male), 
resulting in a mean ratio of 1.56:1. 

Calculation of signal to background ratios 

The signal to background (S/B) ratio is a measure of specific 
signal intensity. For the present data, S/B is the mean intensity 
of hybridization of canine DNA to canine BACs divided by the 
mean intensity of hybridization of canine DNA to Drosophila 
control BAC clones. In two self-self hybridizations, 
differentially labeled male or female canine DNA was 
hybridized onto the arrays. Hybridization intensities were 
determined for each spot after local background subtraction. The 
S/B ratio was then calculated by dividing the Cy3 or Cy5 
intensities for the canine and control clones by the mean Cy3 or Cy5 
intensities of the correspondingly amplified Drosophila BAC 
clones. As shown in Table 2, we observed a mean S/B ratio of 
approximately 20 in both Cy3 and Cy5 channels for all canine 
autosomal clones on the array. Hybridization to total canine 
genomic DNA, as a control, resulted in an increase 
(approximately threefold) of the S/B ratio in both channels. In contrast, 
all other genomic controls (hamster, human and feline), as well 
as vector DNA, showed a dramatically decreased (4-10 fold) 
S/B ratio, with vector DNA and hamster/human genomic 
DNA producing negligible cross-hybridization with labeled 
canine DNA. No signal was observed on spots representing the 
water control. 
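
As a rough illustration of the S/B calculation, the sketch below divides a mean spot intensity by the mean intensity of the Drosophila control spots, and then compares the canine genomic DNA control with the autosomal BAC clones using the values reported in Table 2. The function name and the example intensities are hypothetical; only the two Table 2 rows are taken from the study.

```python
from statistics import mean


def signal_to_background(spot_intensities, control_intensities):
    """Mean background-subtracted spot intensity divided by the mean
    background-subtracted intensity of the Drosophila control spots."""
    return mean(spot_intensities) / mean(control_intensities)


# Hypothetical intensities for one channel (not data from the study).
example_sb = signal_to_background([2000, 2200, 1800], [100, 90, 110])
print(example_sb)

# S/B ratios reported in Table 2, as (Cy5 channel, Cy3 channel).
TABLE2 = {
    "BAC clones (autosomes)": (22.88, 17.49),
    "CFA genomic DNA": (66.73, 48.72),
}


def fold_vs_autosomal_bacs(name):
    """S/B of a spotted DNA relative to the autosomal BAC clones, per channel."""
    bac_cy5, bac_cy3 = TABLE2["BAC clones (autosomes)"]
    cy5, cy3 = TABLE2[name]
    return cy5 / bac_cy5, cy3 / bac_cy3


cy5_fold, cy3_fold = fold_vs_autosomal_bacs("CFA genomic DNA")
print(round(cy5_fold, 2), round(cy3_fold, 2))
```

Both channels give a value just under three, in line with the "approximately threefold" increase stated for total canine genomic DNA.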

Correlation of metaphase-based and array-based CGH data 

Metaphase CGH analyses of the three cases of canine 
lymphoma have been described previously (Thomas et al., 2003a) 
and array CGH data are presented in Fig. 2. Case 4757/00 is a 
low-grade B-cell lymphoma of the centroblastic-centrocytic 
form isolated from a female dog. Conventional CGH analysis 
of this case showed gain of the full length of both CFA13 and 
CFA31. In the array-CGH analysis, all clones with a unique 
cytogenetic location representing CFA13 exhibited a copy 
number gain, as did eight of the ten CFA31 clones. The 
exceptions were clones 247 and 266, which lie in close proximity to 
the CFA31 centromere. A copy number loss was also observed 
for all ten CFAX clones, which was confirmed by metaphase-based 
CGH (data not shown). Array CGH was thus able to 
identify the aberrations observed by metaphase-CGH. 

Case 3280/98 is a male high-grade lymphoblastic lymphoma 
of mixed B- and T-cell origin. Conventional CGH analysis of 
this case showed four gross recurrent chromosome aberrations 
(loss of CFA11, 30 and 38, and gain of CFA36), of which only 
the loss of CFA11 could be detected using this array. 
Metaphase-based CGH suggested that the region of loss extended 
along the entire length of CFA11, with the exception of the 
centromeric region. Array analysis supported these data, generating 
normal copy number ratios for the three most centromeric 
clones (680, 317 and 566) and a copy number loss for the 
remaining 14 CFA11 clones. Clones 298 (representing the FES 
gene on CFA3), 423 (TSC2 on CFA6), 409 (PDGFB on 
CFA10), 429 (HRAS on CFA18) and 283 (INSR on CFA20) 
also demonstrated copy number gain. These copy number 
changes were not detected by metaphase CGH. 

Case 1996/01 is a male low-grade B-cell lymphoma of the 
centroblastic-centrocytic form. Metaphase CGH of this case 
demonstrated gain of CFA13 and loss of CFA14 and CFAY, 
with the gain of CFA13 being interrupted by a small region with 
an apparently normal copy number at CFA13q11dist→q12.1prox. 
Array analysis confirmed these data, with clone 641 
(CFA13q14) being the sole locus from this chromosome to 
produce a normal ratio. Array CGH thus indicated that the small 
balanced region of CFA13 was slightly more distal than was previously 
suggested by metaphase CGH. This difference is likely to be a 
consequence of the improved resolution of array CGH over 
metaphase CGH. Both conventional and array CGH data 
suggest that the gain of CFA13 may be present in a smaller 
proportion of the cell population than was observed for case 4757/00 
and may also possess a degree of clonal variation, since the 
mid-region of the chromosome shows ratios close to normal limits. 
All nine CFA14 clones showed a clear loss of copy number, 
correlating fully with data from metaphase CGH. The 
under-representation of all three CFAY clones implies deletion of this 
chromosome in the tumor. This is consistent with observations from 
prior metaphase CGH analysis. Clones 298, 423 and 429, 
representing the FES (CFA3), TSC2 (CFA6) and HRAS (CFA18) 
genes, also demonstrated a copy number gain of the corresponding 
genomic region, as did clone 653 (CFA18). 



[Fig. 1, panel b: Female-Male Hybridisation; x-axis: Clone Order] 


Fig. 1. Typical array CGH profile of (a) a self-self hybridization 
performed with genomic DNA of a normal female control and (b) a 
sex-mismatched hybridization. Mean, normalized, background-subtracted 
fluorescence ratio data are presented. Error bars show the limited standard 
deviation (SD) in fluorescence ratios for triplicate spots representing the same 
locus. Clones that lie above the green 1.15:1 ratio bar represent copy number 
gains, and those below the red 0.85:1 bar represent genomic losses. Clone 322 
on chromosome CFA13, which maps to multiple locations in the genome, is 
highlighted. 


Table 2. Signal to background (S/B) ratios and the assessment of 
non-specific hybridization. S/B ratios were calculated for both Cy5 and Cy3 
channels for canine-specific autosomal BAC clones, vector DNA and for all 
genomic controls present on the array (canine CFA; hamster MAU; human 
HSA; feline FCA). 

Spotted DNA               S/B ratio, Cy5 channel   S/B ratio, Cy3 channel
BAC clones (autosomes)    22.88                    17.49
CFA genomic DNA           66.73                    48.72
MAU genomic DNA           2.20                     1.78
HSA genomic DNA           4.46                     3.53
FCA genomic DNA           10.32                    7.47
Vector control            1.17                     1.07
Water control             N/A                      N/A

Fig. 2. Array CGH profiles of three lymphoma cases (1996/01, 3280/98 
and 4757/00). Mean, normalized, background-subtracted fluorescence ratio 
data (test:reference) are presented. Error bars show the SD of the ratios of the 
triplicate spots. Regions of the genome showing copy number imbalances are 
highlighted. Clones that lie above the green 1.15:1 ratio bar represent copy 
number gains, and those below the red 0.85:1 bar represent copy number 
losses. 


This study presents a small-scale dog cancer-gene BAC 
microarray for CGH analysis of canine tumors. The array has 
been developed following existing procedures for human CGH 
arrays (Fiegler et al., 2003) and subjected to a series of 
validation experiments. Self-self hybridizations and sex-mismatched 
hybridizations using genomic reference DNA from clinically 
normal control individuals confirmed the ability to generate 
consistent and reproducible data. In addition, sex-mismatched 
hybridizations demonstrated clearly that the array is able to 
detect copy number imbalances for the canine sex 
chromosomes against a normal autosomal background. 

Data derived by conventional and array-based CGH 
analyses correlated closely. In all three lymphoma cases, each of the 
aberrations identified by metaphase-based analysis was 
successfully detected by the array. Moreover, knowledge of the 
precise chromosomal location of each BAC clone included on 
the array enabled the boundaries of genomic imbalances to be 
resolved more accurately than is possible with conventional 
analysis. Furthermore, additional imbalances, such as those for 
BACs representing cancer-associated genes, were detected by 
array CGH that were not identified by conventional analysis. 
These data reflect the improved resolution limits of the array 
technique over that of a metaphase-based approach. 

It was noted that female:male fluorescence intensities for 
the ten CFAX clones approached, but did not reach, the 
expected 2:1 ratio, ranging instead from 1.35:1 to 1.81:1 (mean 
1.56). This, however, compares well to previously reported 
values of 1.73:1 (Fiegler et al., 2003) and 1.65:1 (Snijders et al., 
2001) in human array CGH. Underestimation of fluorescence 
ratio changes representing single copy number imbalances of 
the X chromosome in human array studies has largely been 
attributed to incomplete suppression of repetitive elements in 
the DNA targets (Pinkel et al., 1998; Snijders et al., 2001). It is 
likely that this may also be the reason for the lower ratios 
observed with our canine array. Similarly, as with conventional 
CGH, it is possible that difficulty in achieving optimal 
suppression of the highly repetitive DNA sequences in centromeric 
(and telomeric) regions prevents the establishment of accurate 
and consistent hybridization measurements. For this reason 
centromeric and telomeric sites are often excluded from CGH 
analysis (for example, Speicher et al., 1993). In the present 
study, increasing the concentration of canine competitor DNA 
did not appear to overcome these limitations (data not shown) 
and so it is most likely that parameters other than simply 
concentration of the competitor will play a role. Further 
modifications to hybridization and washing conditions, and to the 
nature of the competitor DNA itself, may be required in order 
to achieve optimal suppression of repetitive sequence in the 
canine CGH array. 
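
To make the argument about ratio compression concrete, the following toy model (an assumption introduced here purely for illustration, not a model fitted or proposed in this study) treats incompletely suppressed repeats as an equal additive non-specific signal in both channels, expressed in units of the specific signal from one copy. Under that assumption, any non-zero non-specific signal pulls a true 2:1 ratio toward 1:1.

```python
def observed_ratio(test_copies, ref_copies, nonspecific):
    """Ratio seen when both channels carry the same additive non-specific
    signal (in units of the specific signal from one copy)."""
    return (test_copies + nonspecific) / (ref_copies + nonspecific)


def implied_nonspecific(observed, test_copies=2, ref_copies=1):
    """Invert the toy model: b = (t - R * r) / (R - 1) for observed ratio R."""
    return (test_copies - observed * ref_copies) / (observed - 1)


# With no non-specific signal the female:male X ratio is the expected 2:1.
print(observed_ratio(2, 1, 0.0))

# Non-specific signal that this toy model would need to give the mean 1.56:1.
b = implied_nonspecific(1.56)
print(round(b, 3))
```

The point of the sketch is qualitative: the compression grows with the non-specific fraction, which is why improved repeat suppression, rather than the copy-number difference itself, is the parameter to optimize.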

Underestimation of single copy number imbalances has also 
been attributed to contamination of arrayed DNA by E. coli 
DNA derived from the host vector during clone DNA isolation. 
Such contamination contributes to non-specific hybridization 
and thus skews the expected fluorescence ratio. DOP-PCR 
amplification of canine BAC DNA using PCR primers 
specifically designed to minimize amplification of contaminating 
bacterial DNA (Fiegler et al., 2003) helped to overcome this 
problem. 

Non-specific binding of probe DNAs may result where 
arrayed clone DNA shares extensive sequence similarity to 
multiple genomic sites, such as is the case for gene families. 
This issue has been minimized in the present study by 
performing conventional FISH analysis of each clone to confirm its 
localization to a unique chromosomal location, and to 
investigate anomalous findings. Clone 322, assigned by RH-mapping 
analysis to CFA13, showed a copy number gain in 
sex-mismatch hybridizations (Fig. 1b) and an unexpected copy 
number gain in case 3280/98, in which no CFA13 aberration had 
been observed by conventional CGH analysis. In addition, 
clone 322 showed a 1:1 ratio in case 4757/00, while all 
remaining CFA13 clones showed a copy number gain, as was predicted 
from conventional CGH analysis. FISH analysis of clone 322 
demonstrated multiple hybridization signals, with the primary 
site at CFA13q21.1→q21.2. Since this clone was shown not to 
be chimeric by RH mapping of BAC end sequences, the 
anomalous data are most likely due to sequence identity with other 
genomic sites, such as the presence of highly repetitive 
sequence motifs. The non-specificity of FISH signal is likely to 
explain the atypical array CGH data obtained for this clone. 
Clone 653 was proposed by RH-mapping analyses to lie on 
CFAX, but in array analysis generated signal ratio data 
inconsistent with that expected for a clone residing on this 
chromosome. FISH analysis demonstrated that clone 653 hybridized 
solely to CFA18 and it was therefore reclassified as a CFA18 
marker. These observations highlight the importance of 
verifying unequivocally the chromosomal location of each clone prior 
to use in array analysis, as anomalies such as these may not 
always be so readily apparent. 

Additional support of array data would be provided by the 
ability to use subsets of the arrayed clones in direct cytogenetic 
evaluation of tumor genomes. This is of particular interest for 
single locus markers representing key genes of interest, such as 
the cancer-related genes described here, as these would provide 
direct and quantifiable evidence, using either metaphase or 
interphase analysis of tumor cells, to demonstrate copy number 
changes of loci that have known biological effects. It would also 
be of value in resolving ambiguities, such as the observations in 
case 1996/01 where regions of CFA13 gain did not appear to 
correlate precisely between array and metaphase CGH 
analyses. Previously, direct cytogenetic analyses of canine tumors 
have predominantly utilized whole chromosome paint probes 
for detection of recurrent aberrations, which are dependent on 
availability of metaphase chromosome spreads. Limitations to 
the generation of high quality chromosome preparations from 
tumor cells have been widely documented, in part due to the 
decrease in cell viability resulting from the often unavoidable 
time delay incurred between biopsy and sample receipt. The 
need for metaphase chromosome preparations also limits the 
use of archived case material, which may only be available as 
paraffin-embedded formalin-fixed tissue sections. With a 
genome-wide panel of well-characterized, unique single locus 
BAC FISH probes, however, we are now in a position where 
direct interphase analysis of neoplastic cells is both possible 
and highly informative. It will therefore become increasingly 
important to ensure the availability of tumor interphase nuclei, 
either by culture of viable cells or from fixed-tissue sections. 

The data presented in this report confirm that genomic 
array CGH represents a highly valuable and accessible 
technology for cancer genome studies in the dog. Validation trials 
confirmed the ability to generate reproducible results both within 
and between hybridization experiments, and data from 
conventional and array-based analyses show extensive correlation. 
Array analysis has further resolved chromosome aberrations 
detected by prior metaphase CGH to precise subchromosomal 
locations. Significantly, by identifying genomic imbalances in 
tumor cells that were not detected by conventional CGH, the 
array data presented in this paper have clearly demonstrated the 
increased resolution of array vs. metaphase CGH analysis. 
Interestingly, most of the novel findings represented key genes 
involved in cell cycle regulation. Without array CGH these 
observations would likely have remained invisible at the 
cytogenetic level. The copy number gains of the FES, TSC2 and 
HRAS loci reported here are particularly worthy of further 
investigation to evaluate their biological significance both 
within and between tumor cases. 

Abstract. The proto-oncogene, c-kit (KIT), encodes a 
tyrosine kinase receptor, and mutations in this gene are causative 
for several mammalian diseases, including cancer and a form of 
pigmentation-associated hereditary deafness. Our laboratories 
are interested in a form of hereditary deafness that is associated 
with abnormalities in pigmentation and is common in the 
Dalmatian. Thus, KIT is being analyzed as a candidate gene for 
deafness in this breed. In addition to our interest in deafness, 
we are involved in mapping gene loci in the canine genome. 
Reported here is the identification of two isoforms of canine 
C-kit and radiation hybrid mapping of KIT to CFA13. 

Copyright©2003 S. Karger AG, Basel 


The KIT gene is a proto-oncogene that has been implicated 
in mastocytomas, the most common tumor of dogs (London et 
al., 1999; Ma et al., 1999). Additionally, KIT is involved in 
pigmentation in mammals. For example, dominant white 
spotting in the mouse is caused by mutations at the W (Kit) locus 
(Chabot et al., 1988; Geissler et al., 1988). Mice harboring 
mutations in Kit exhibit impairment in pigmentation (white 
spots), anemia, sterility (Russell, 1979; Geissler et al., 1981), 
and loss of hearing (Cable et al., 1994). Three observations 
make KIT an ideal candidate gene for deafness in the 
Dalmatian: this latter finding; the fact that the product of KIT is 
involved in migration and differentiation of neural crest cells 
(e.g., melanoblasts, hematopoietic stem cells and germ cells) 
(Russell, 1979; Geissler et al., 1981); and the clear association 
of deafness in the Dalmatian with differences in pigmentation 
(i.e., blue eyes) (Strain et al., 1992). 

In the human, the cDNA is 2.9 kb and consists of 21 exons 
encoding a 976 amino acid polypeptide (Yarden et al., 1987). 
An alternate splice site in the segment encoding the 
juxtamembrane region of the extracellular domain of C-kit results in the 
presence (GNNK+) or absence (GNNK-) of the tetrapeptide 
sequence Gly-Asn-Asn-Lys in both the human and mouse 
(Reith et al., 1991; Crosier et al., 1993). Although the GNNK- 
form dominates, the isoforms are co-expressed in most tissues. 
Functional differences between the isoforms vary in different 
tissues (Ashman, 1999). For example, Reith et al. (1991) found 
that the two forms demonstrate different signaling properties in 
COS cells; however, the isoforms showed little difference when 
expressed in bone marrow-derived mast cells from Wsh mutant 
mice (Serve et al., 1995). 

Our long-term goal is to understand the genetics underlying 
deafness in the Dalmatian. As part of this effort, candidate 
genes (and/or corresponding cDNAs) such as KIT are being 
sequenced and mapped to the canine genome. Reported here is 
the identification of two isoforms of C-kit and radiation hybrid 
(RH) mapping of the gene. 

Materials and methods 

Renal cortex samples from a deaf Dalmatian and from a normal Labrador 
Retriever were collected at necropsies and stored in RNAlater (Ambion, 
Austin, TX). Extraction of mRNA from the samples was done using the 
Poly(A)Pure kit from Ambion. First-strand cDNA was synthesized using the 
e-AMV RT kit (Sigma-Aldrich, St. Louis, MO) with oligo(dT) primers. 


Table 1. Primer sequences for KIT 

Primer   Primer sequence                               Annealing     Estimated product
                                                       temp. (°C)    size (bp)
0        F: 5'-CTCAGAGTCTATCGCAGCCACC-3'               60            430
         R: 5'-TCAGAGGGCAGCGGACCAA-3'
1        F: 5'-GCCTGGGATTTTCTCTGCGTC-3'                60            800
         R: 5'-GCTGAGCTGATAATCAACTTTTCCTG-3'
2        F: 5'-GGACCAGAAGGGCAGGACG-3'                  60            740
         R: 5'-AATACCAATCTACTGCGGGCTCTG-3'
3        F: 5'-TAACCAGATTAAAAGGGAACGAAGGAGGC-3'        67; 60        660
         R: 5'-CCGAAGGCACCAGCACCCAAAGT-3'
4        F: 5'-CACACCCTGTTCACACCTTT-3'                 56            715
         R: 5'-AGCCAACTCATCATCTTCCAT-3'
5        F: 5'-CTCGCAGAATAGGCTCATACATAG-3'             61; 54        685
         R: 5'-AGAGGCTGGGTGGAAGACG-3'
6        F: 5'-TTCTCTTTAGGAAGCAGCC-3'                  53            400
         R: 5'-ATCGCTCTTGTTGGGGA-3'


TTTAAAGGTAACAGCAAAGAACAA 
TTTAAA------------GAACAA 
 F  K  G  N  S  K  E  Q 

Fig. 1. The nucleotide and amino acid sequence for KIT at the junction 
of exons 9 and 10 showing the tetrapeptide isoforms. 


Seven overlapping primer sets (Table 1) were designed from canine 
sequence (GenBank sequence AF099030) to capture the 2.9-kb KIT cDNA. 
PCR was performed using Taq DNA polymerase in a total reaction volume 
of 25 µl. Conditions for amplification varied for primer sets. Primer sets 0, 1, 
2, 4, and 6 used the following conditions: 1 min and 30 s 94 °C, followed by 
35 cycles of 30 s 94 °C, 30 s at annealing temperature listed in Table 1, 1 min 
72 °C with a final extension 5 min 72 °C. Primer sets 3 and 5 utilized 
touchdown PCR with the following conditions: 1 min and 30 s 94 °C, followed by 8 
cycles of 30 s 94 °C, 30 s at first annealing temperature (decreasing by 1 °C 
per cycle), 1 min 72 °C and 27 cycles of 30 s 94 °C, 30 s at lowest annealing 
temperature, 1 min 72 °C with a final extension 5 min 72 °C. The amplified 
products were ligated into pCR4.0-TOPO (Invitrogen, Carlsbad, CA) and 
transformed into chemically competent Escherichia coli TOP-10 cells 
(Invitrogen). Multiple clones representing the Dalmatian and Labrador Retriever 
KIT sequences were picked and sequenced. Vector NTI Suite II software 
(Informax, Bethesda, MD) was used to assemble contiguous cDNA 
fragments and to determine the deduced amino acid sequence for C-kit. 
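
The touchdown programs for primer sets 3 and 5 can be written out cycle by cycle. The short Python sketch below is only an illustration of the schedule described above; the function name and its parameters are hypothetical, and it assumes that the two temperatures listed for these primer sets in Table 1 (67; 60 and 61; 54) are the first and lowest annealing temperatures, respectively.

```python
def touchdown_schedule(first_temp, touchdown_cycles, step, plateau_temp, plateau_cycles):
    """Annealing temperature (°C) for every cycle of a touchdown PCR:
    a ramp that drops by `step` each cycle, then a fixed plateau."""
    ramp = [first_temp - i * step for i in range(touchdown_cycles)]
    return ramp + [plateau_temp] * plateau_cycles


# Primer set 3: 8 touchdown cycles from 67 °C, then 27 cycles at 60 °C.
set3 = touchdown_schedule(67, 8, 1, 60, 27)
# Primer set 5: 8 touchdown cycles from 61 °C, then 27 cycles at 54 °C.
set5 = touchdown_schedule(61, 8, 1, 54, 27)

print(set3[:8], len(set3))
print(set5[:8], len(set5))
```

Under this reading the eighth touchdown cycle already anneals at the lowest temperature, and both programs total 35 cycles, the same number used for the fixed-temperature primer sets.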

The PCR primer pairs used in RH mapping of KIT are:
5'-GATTTGTCAAGTGGACTTTTGAGAC-3' and 5'-CTCTGACAAACACATAAATGGACCT-3'.
They generated a PCR product of 153 bp from canine genomic
DNA. Amplification of KIT was tested using standard conditions individually
on canine and hamster DNAs and on a mixture (1:3) of canine and hamster
DNA. Amplification was performed in 10 µl reactions containing 50 ng
DNA, 0.3 µM of each primer, 250 µM of each dNTP (Pharmacia), 2 mM
MgCl2, 1× AmpliTaq Buffer and 0.5 U AmpliTaq Gold (Perkin-Elmer).
PCRs were carried out in PTC-200 PCR machines (MJ Research) with the
following program: 7 min at 95 °C, followed by 20 cycles of 30 s at 94 °C, 30 s
at 63 °C (decreasing by 0.5 °C per cycle) and 1 min at 72 °C, then 15 cycles of
30 s at 94 °C, 30 s at 53 °C and 1 min at 72 °C, and a final extension of 2 min
at 72 °C. The
fragment derived from KIT was canine specific and could be readily typed on
the radiation hybrid panel RHDF5000-2 composed of 118 cell lines (Vignaux
et al., 1999) using the above PCR conditions. PCR products were
loaded on 2% agarose gels containing ethidium bromide and run in 0.5×
TBE at 120 V for 30 min. Products were visualized under UV light, images
were recorded and results were scored as present, absent or ambiguous
in the 118 hybrid cell lines. The typing data were incorporated into the
Breen et al. (2001) radiation hybrid map, using the two-point analysis of the
Multimap package (Matise et al., 1994).
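
The touchdown program used for RH typing can be made explicit by listing the annealing temperature of every cycle. The following is only an illustrative sketch: the function name is hypothetical, and it assumes that the first touchdown cycle anneals at 63 °C and that each later touchdown cycle is 0.5 °C lower, so that the twentieth cycle anneals at 53.5 °C before the 15 cycles at 53 °C begin.

```python
def touchdown_annealing(start_c, step_c, touchdown_cycles, final_c, final_cycles):
    """Return the annealing temperature (deg C) used in each PCR cycle.

    Assumption: the first touchdown cycle runs at start_c and every
    following touchdown cycle is step_c lower than the one before it.
    """
    touchdown = [start_c - step_c * i for i in range(touchdown_cycles)]
    plateau = [final_c] * final_cycles
    return touchdown + plateau


# RH-typing program from the text: 20 touchdown cycles from 63 deg C
# (0.5 deg C lower per cycle), then 15 cycles at 53 deg C.
schedule = touchdown_annealing(63.0, 0.5, 20, 53.0, 15)

for cycle, temperature in enumerate(schedule, start=1):
    print(f"cycle {cycle:2d}: anneal at {temperature:.1f} C")
```

Under the same assumption, the touchdown program for primer sets 3 and 5 would be `touchdown_annealing(first_temp, 1.0, 8, lowest_temp, 27)`, with the two temperatures taken from Table 1.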

Results and discussion 

The first aspect of this work focused on identification of isoforms
encoding different C-kit proteins. Several independent
clones from both the Dalmatian and Labrador Retriever were
sequenced. The deduced amino acid sequence identity between
the Dalmatian and Labrador Retriever is 100%. Sequencing
revealed that the Dalmatian and Labrador Retriever each produce
two different cDNAs (GenBank Accession numbers
AY296484 and AY313776) that encode slightly different proteins.
The isoforms result from an alternate splice site between
putative exons 9 and 10 and are characterized by the presence
or absence of the tetrapeptide Gly-Asn-Ser-Lys (GNSK)
(Fig. 1). These two canine isoforms differ from those identified
in the human and mouse in that they have a serine at the third
amino acid position instead of an asparagine.

The marker KIT was typed in duplicate on the 118-cell line
RH panel and computed on the Breen et al. (2001) version of
the RH map. The marker KIT was co-localized with two microsatellite
markers, REN126A23 and REN166I13, on CFA13 at
a unique position. It is also close to one microsatellite marker,
C13.365, with a lod score of 17 and to one gene marker,
CNCG1 (current human gene symbol: CNGA1 for cyclic nucleotide
gated channel alpha 1), with a lod score of 16. The
placement of canine KIT is consistent with the conserved
segment in the human. That is, human KIT is at HSA4q12
(http://genome.ucsc.edu) and this region corresponds to part of
CFA13.

The initial goal of this work was to identify isoforms and 
map KIT and the long-term goal is to determine whether it 
plays a role in deafness in the Dalmatian. Only one Dalmatian 
was examined and this obviously prevents any conclusions 
regarding KIT and deafness; however, further analysis of this 
gene is underway. 

Abstract. In the present report we show the chromosomal
localization of two BAC clones, carrying the leptin (LEP) and
insulin-like growth factor 1 (IGF1) genes, respectively, in four
species belonging to the family Canidae: the dog, red fox, arctic
fox and the Chinese raccoon dog. The assignments are in agreement
with earlier data obtained from comparative chromosome
painting for the dog, red fox and arctic fox.

Copyright©2003 S. Karger AG, Basel 


In the family Canidae there are four species, namely the dog
(Canis familiaris), red fox (Vulpes vulpes), arctic fox (Alopex
lagopus) and raccoon dog (Nyctereutes procyonoides), which
have been extensively studied using cytogenetic techniques (for
review see Switonski et al., in press). Rapid progress of the
physical genome map of the dog (Breen et al., 2001a) has facilitated
comparative studies on genome organization in these species.
FISH mapping of canine-derived probes carrying
microsatellites in the genomes of the foxes and the raccoon dog
has been carried out (Yang et al., 2000; Rogalska-Niznik et al.,
2003; Szczerbal et al., this volume). On the other hand, there
are very few reports demonstrating the comparative assignment
of gene loci in these genomes. The development of canine
BAC libraries brings new opportunities in this field (Schelling
et al., 2002).

In the present paper we report on the FISH mapping of two
BAC clones, carrying leptin (LEP) and insulin-like growth factor 1
(IGF1) genes, in four canids: the dog, red fox, arctic fox
and Chinese raccoon dog.

Materials and methods

Chromosome preparation and identification 

Blood samples were collected from a healthy dog, red fox, arctic fox and
Chinese raccoon dog. The foxes and the Chinese raccoon dog originated from
a local fur farm. Chromosome preparations were obtained from short-term
lymphocyte cultures and stained with the Q-banding technique prior to
FISH. The chromosome nomenclature for the dog (Switonski et al., 1996; Breen
et al., 1999), red fox (Makinen et al., 1985b), arctic fox (Makinen et al.,
1985a) and the Chinese raccoon dog (Pienkowska et al., 2002) was applied.

Probe preparation for FISH experiments 

A canine BAC library (Schelling et al., 2002) was screened for the LEP and
IGF1 genes using specific canine primers. For the IGF1 gene (GenBank
accession number L08254; Klukowska et al., in press):

Forward: 5'-TCACATCTCTTCTACCTGG-3',

reverse: 5'-AAGTAGAACCCCCTGTCTCC-3'

and for the canine leptin gene (GenBank accession AB020986; Chmurzynska
et al., 2003):

Forward: 5'-TTGTGGACCTCTGTGCCGATTC-3',

reverse: 5'-ATCCTGGCGACAATCGTCTTG-3'.

Following the screening, the DNA of clone S032P02C09, containing
the LEP gene, and clone S083P05C12, containing the IGF1 gene, was
isolated and labeled with biotin-16-dUTP by random priming.

Fluorescence in situ hybridization 

The labeled probes with an excess of canine Cot-1 DNA were denatured
for 8 min at 80 °C, preannealed for 30 min at 37 °C and applied onto denatured
chromosome preparations. Hybridization was carried out overnight at
37 °C. Signal detection and amplification were done using avidin-FITC and
anti-avidin. Staining was performed with propidium iodide. Slides were analyzed
under a fluorescence microscope (Nikon E600 Eclipse) equipped with
a cooled digital CCD camera, driven by the Lucia software.

Table 1. Localization of two BAC clones in chromosomes of four species
of the family Canidae

BAC probe  Chromosome location
           Dog                   Red fox   Arctic fox  Chinese raccoon dog

LEP        14q11 a; 14q11-12 b   7q11-12   13q12-13    1p11
IGF1       15q23-24 a,b,c        10q15     23q15       18q17

a Chromosome nomenclature by Switonski et al. (1996).
b Chromosome nomenclature by Breen et al. (1999).
c Previously assigned (Breen et al., 2001b).


Fig. 1. Partial Q-banded metaphase spreads of the dog (A) and the Chinese
raccoon dog (B) and the same metaphases after FISH with the BAC
probe carrying the leptin gene (a) and insulin-like growth factor 1 gene (b).
Hybridization signals are indicated by arrows.

Fig. 2. Ideogram of chromosomes carrying
LEP and IGF1 loci. CFA = dog, VVU = red fox,
ALA = arctic fox and NPP = Chinese raccoon
dog.





Results and discussion 

Comparative molecular studies of both genes in the four 
canids were carried out recently and their conserved structure 
was revealed (Chmurzynska et al., 2003; Klukowska et al., in 
press). Specific primers, described by the authors, facilitated 
selection of BAC clones carrying both genes. 

For all four species positive hybridization results were
obtained and the probes hybridized clearly to a single chromosome
pair (Fig. 1). Diagrammatic representations of all localizations
are shown in Fig. 2 and are summarized in Table 1.
The IGF1 gene was previously mapped to canine chromosome
15q24 (Breen et al., 2001b). Our study confirmed this localization.
The other seven localizations are new. In
spite of the fact that in the dog genome there are over 300 loci
mapped by FISH, the leptin locus is only the second one assigned
to chromosome 14, after the heterogeneous nuclear
ribonucleoprotein A2/B1 gene, HNRPA2B1 (Breen et al., 2001b).
In the arctic fox and Chinese raccoon dog both genes are the
only ones mapped onto chromosomes 13 and 23, and 1 and 18,
respectively. The number of cytogenetically assigned loci
reached 35 in the arctic fox and Chinese raccoon dog (Szczerbal
et al., this volume) and over 60 in the red fox (Serov and Rubstov,
1998; Yang et al., 2000). In the latter species approximately
half of the localizations were obtained with the use of the
somatic cell hybridization technique.


The localizations of the LEP and IGF1 genes in the dog, red
fox and arctic fox are in agreement with earlier data obtained
by the comparative chromosome painting approach for the dog
and red fox (Yang et al., 1999) and the dog and arctic fox
(Graphodatsky et al., 2000) genomes.

Abstract. New chromosomal assignments of canine-derived
cosmid clones containing microsatellites to the Chinese raccoon
dog and arctic fox genomes are presented in the study.
The localizations are in agreement with data obtained from
comparative chromosome painting experiments between the
dog and arctic fox genomes. However, paracentric inversions
have been detected by comparing the loci order in canid karyotypes.
The number of physically mapped loci increased to thirty-five
both in the Chinese raccoon dog and in the arctic fox.
Furthermore, the present status of the cytogenetic map of the
Chinese raccoon dog and arctic fox is presented in this study.

Copyright©2003 S. Karger AG, Basel 


Rapid progress of the physical dog genome map brings a
unique opportunity to develop genome maps for other canids.
There are two reasons which make such studies worthwhile. First, the
physical map of other canid species facilitates a detailed analysis
of the genome evolution in this family. Until now extensive
comparative chromosome painting studies, with the use of
human, canine and fox probes, were performed on the dog
(Yang et al., 1999), red fox (Yang et al., 1999), arctic fox
(Graphodatsky et al., 2000) and the Japanese raccoon dog
(Graphodatsky et al., 2001). This approach reveals major chromosome
rearrangements which occurred in the course of karyotype evolution,
but is not sensitive enough to detect minor intrachromosomal
changes. Such minor rearrangements were recently presented
for the dog chromosome 9, in comparison with the arctic fox
chromosome 12 and the Chinese raccoon dog chromosome 5
(Rogalska-Niznik et al., 2003). The second reason for the development
of genome maps of other canids is the fact that some
species are considered to be farm animals, namely the red fox,
arctic fox and Chinese raccoon dog. Therefore, the maps may
be useful in the identification of genes responsible for important
productive traits, as is the case in other farm animals.

In the present paper we report on the FISH localization of 
twelve canine microsatellite-containing cosmid clones, in the 
Chinese raccoon dog and arctic fox genomes. 

Material and methods 

Probe source 

Fifteen canine cosmid clones, containing polymorphic microsatellites,
described as CanBern1 (Dolf et al., 1997a), CanBern2 (Dolf et al., 1997b),
ZuBeCa4 (Dolf et al., 1998), ZuBeCa11, ZuBeCa15, ZuBeCa18 (Schlapfer
et al., 1999), ZuBeCa19, ZuBeCa21, ZuBeCa22, ZuBeCa23, ZuBeCa26
(Schelling et al., 2000) and ZuBeCa31, ZuBeCa33, ZuBeCa35, ZuBeCa36
(Dolf et al., 2000) were used in FISH experiments.

Fig. 1. Partial Q-banded metaphase spreads of the Chinese raccoon dog (A) and arctic fox (B) and the same metaphases after
FISH with ZuBeCa21 (a) and ZuBeCa31 (b) probes. Hybridization signals are indicated by arrows.


Chromosome preparations and chromosome identification 

Chromosome spreads were obtained from routine short-term lymphocyte
cultures. Prior to hybridization, chromosomes were Q-banded using a
0.005% quinacrine mustard solution for 90 s. Chromosome nomenclature
for the arctic fox (Makinen et al., 1985) and the Chinese raccoon dog
(Pienkowska et al., 2002b) was applied.

Fluorescence in situ hybridization 

Cosmid DNA was labeled with biotin-16-dUTP by random priming.
Probes were denatured at 70 °C for 10 min, preannealed for 15 min and
applied onto slides denatured in 70% formamide for 2 min. Hybridization
was carried out overnight at 37 °C. Signal detection and amplification were
performed using the avidin-FITC/anti-avidin system. Image capturing and
processing were performed under a Nikon E600 Eclipse fluorescence
microscope equipped with a cooled digital camera, driven by the
LUCIA software.


Results and discussion 

All but three of the canine cosmids (ZuBeCa19, ZuBeCa22
and ZuBeCa33) were successfully mapped to the Chinese raccoon
dog and arctic fox chromosomes (Fig. 1). In this study
eight new assignments to the Chinese raccoon dog and nine to
the arctic fox are presented (Table 1). As shown there, all probes
were previously localized by FISH also in the dog and red fox
genomes. The results were compared with data obtained from
chromosome painting experiments between the dog and the
arctic fox (Graphodatsky et al., 2000). Our assignments are in
full agreement with the comparative painting data. In the case
of the Chinese raccoon dog we cannot draw such a conclusion
since only the Japanese raccoon dog (Nyctereutes procyonoides
viverrinus) was compared with the dog genome by the ZOO-FISH
approach. It is worthwhile to mention that the two subspecies
of the raccoon dog differ significantly in their chromosome
numbers, and thus it is not possible to extrapolate results
obtained for one subspecies to the other one. The diploid chromosome
number of the Chinese raccoon dog is 2n = 54 (+ 1-4
Bs), while the Japanese raccoon dog has 2n = 38 (+ 1-7 Bs)
chromosomes.

The use of ZuBeCa19, ZuBeCa22 and ZuBeCa33 probes
was unsuccessful and no specific signals on the arctic fox chromosomes
were detected. According to Yang et al. (2000), no
hybridization signals for ZuBeCa22 were observed on the dog
and red fox chromosomes, while ZuBeCa19 hybridized to centromeric
regions of nine pairs of the small canine autosomes
and gave no signals on the red fox chromosomes. On the other
hand, ZuBeCa33 was assigned to the dog and red fox chromosomes
(Yang et al., 2000). Also an attempt to localize ZuBeCa33
in the Chinese raccoon dog chromosomes was unsuccessful.

Chromosome localization of microsatellite markers facilitated
the detection of intrachromosomal rearrangements which
occurred during karyotype evolution in the investigated species.
The identified changes of the ZuBeCa23 and ZuBeCa25
order in the canine chromosome 17q, when compared with the
arctic fox chromosome 5p and chromosome 13q of the Chinese
raccoon dog, suggest a paracentric inversion event. A similar
situation was also found in the dog chromosome 29, where
there are two markers - Keratin2e and ZuBeCa26 - localized in
close vicinity, whereas the same markers in the Chinese raccoon
dog and arctic fox are distant (Fig. 2). Additionally, the
localization of CanBern6, ZuBeCa18 and ZuBeCa36 confirmed
the rearrangements described recently by Rogalska-Niznik
et al. (2003). On the other hand, a detailed analysis of the
markers assigned to chromosome 9 (CanBern1, ZuBeCa5 and
ZuBeCa8) and 14 (ZuBeCa17, ZuBeCa10 and CanBern6) of
the Chinese raccoon dog revealed that ZuBeCa10 resides on
chromosome 14q, rather than on chromosome 9, as was indicated
earlier by Rogalska-Niznik et al. (2003).

Fig. 2. Two examples of intrachromosomal
rearrangements which occurred during evolution
in the karyotype of the Chinese raccoon dog
(NPP), the dog (CFA) and the arctic fox (ALA).
For abbreviations see legend to Fig. 3.

Table 1. Comparative chromosomal localization
of twelve canine-derived cosmid probes in
four species of the family Canidae

Microsatellite  Chromosomal localization
marker
                Dog a     Dog b           Red fox   Arctic fox   Chinese raccoon dog

CanBern1        19q17     19q23           5p22      24q15 d      9q23
CanBern6        30q16     33q15.3 c       1q21      1q21         14q19
ZuBeCa4         3q16      3q21.1          14q23     4q14-16 d    6q21
ZuBeCa11        18q18     18q24-25.3      5q12      11q11-12 d   17q12
ZuBeCa15        24q15     24q23           14p15     18q13        4p15 e
ZuBeCa18        9q26      9q26.1-26.2     2p14      12q12        5q15 e
ZuBeCa21        21q15     21q22           11p22     15q13        20q15
ZuBeCa23        17q14     17q13           8q23      5p14         13q17
ZuBeCa26        29q12     27q11 c         8p12      17q11        3p11 e
ZuBeCa31        25q16     30q15.1-15.2 c  15q21     2p21         2q12
ZuBeCa35        5q23      5q32            12q32     10q22        3q22 e
ZuBeCa36        9q26      9q26.1-26.2     2p14      12q12        5q15

Sources of the data: Dog a and red fox, Yang et al. (2000) and Dolf et al. (2000);
Dog b, Breen et al. (1999); arctic fox and Chinese raccoon dog, this study.

a Original FISH localization data.
b Localization following chromosome nomenclature endorsed by the ISAG (2000).
c According to the data on reciprocal comparative chromosome painting, using human and canine probes, canine
chromosomes 25, 29 and 30 in Yang’s nomenclature correspond to chromosomes 30, 27 and 33, respectively, in
Breen’s nomenclature (Graphodatsky et al., 2000).
d Rogalska-Niznik et al. (2000).
e Szczerbal et al. (2003).


Altogether, to date thirty-five loci have been physically
mapped in the Chinese raccoon dog and arctic fox genomes
(Rogalska-Niznik et al., 2000, 2003; Szamalek et al., 2002;
Pienkowska et al., 2002a; Szczerbal et al., 2003; this study). All
of them were mapped with the use of canine-derived
probes. In the Chinese raccoon dog genome the mapped loci
were assigned to fifteen of the twenty-seven chromosomes
present in the karyotype (Fig. 3a). In the arctic fox genome
fourteen of the twenty-five chromosomes carry
assignments (Fig. 3b). It seems that further work on the cytogenetic
maps of both genomes should be focused on the localization
of type I markers. Such an attempt was recently undertaken
and the canine BAC clones, carrying leptin and insulin-like
growth factor 1 genes, were mapped in the genomes of four
canids (Szczerbal et al., this volume).

Fig. 3. Present status of the Chinese raccoon dog
(a) and arctic fox (b) cytogenetic genome maps. Only
chromosomes carrying mapped loci are shown. The
following abbreviations were used: Z: ZuBeCa, C:
CanBern, 5S: 5S rDNA, LEP: leptin gene, IGF1: insulin-like
growth factor 1, K2e: basic keratin gene cluster,
K9: acid gene cluster.

Abstract. Effective utilization of the domestic cat as an animal
model for hereditary and infectious disease requires the
development and implementation of high quality gene maps
incorporating microsatellites and conserved coding gene markers.
Previous feline linkage and radiation hybrid maps have
lacked sufficient microsatellite coverage on all chromosomes to
make effective use of full genome scans. Here we report the
isolation and genomic mapping of 304 novel polymorphic
repeat loci in the feline genome. The new loci were mapped in
the domestic cat radiation hybrid panel using an automated
fluorescent TaqMan-based assay. The addition of these 304
microsatellites brings the total number of microsatellites mapped
in the feline genome to 580, and the total number of loci
placed onto the RH map to 1,126. Microsatellites now span
every autosome with an average spacing of roughly one polymorphic
STR every five centimorgans, and full genome coverage
of one marker every 2.7 megabases. These loci now provide
a useful tool for undertaking full-genome scans to identify genes
associated with phenotypes of interest, such as those relating to
hereditary disease, coat color, patterning and morphology.
These resources can also be extended to the remaining 36 species
of the cat family for population genetic and evolutionary
genomic analyses.

Copyright©2003 S. Karger AG, Basel 


The last two decades have witnessed the development of
comparative genomic analysis in the domestic cat (O’Brien and
Nash, 1982; O’Brien et al., 1997, 1999, 2002). Past and present
studies have utilized the feline model for identifying genes
responsible for viral-induced cancer, infectious disease, hereditary
disease, and coat coloration (O’Brien et al., 1986, 1999,
2002; Fyfe et al., 1999; Parker et al., 2001; Eizirik et al., 2003).
Because the domestic cat and the other 36 modern species of
the cat family, Felidae, have evolved within the past 10 million
years, genomic tools developed for the domestic cat can be
directly applied to the conservation genetics and historical
demography of exotic felids and the infectious diseases that
afflict them (Brown et al., 1994; Carpenter and O’Brien, 1995;
O’Brien, 1995; Eizirik et al., 2001; Driscoll et al., 2002). Further,
the remarkable conservation of synteny between the feline
genome and other mammalian genomes has shed light on patterns
of chromosomal rearrangement and genome evolution
across a broader evolutionary spectrum (Rettenberger et al.,
1995; Murphy et al., 2001, 2003).

The promise of further utilizing the feline genome as a model
organism has been driven by the recent development of
moderate-level linkage and radiation hybrid maps (Menotti-Raymond
et al., 1999, 2003; Murphy et al., 1999, 2000; Sun et al.,
2001). Previously published maps together contain 864 loci,
cover all eighteen feline autosomes and both sex chromosomes,
and are dominated by Type I coding gene markers (585 versus
279 microsatellite loci), some of which were mapped in the
interspecies backcross pedigree that could not be typed in the
RH panel. Although present microsatellite coverage spans all
feline chromosomes (Menotti-Raymond et al., 2003), several
chromosomes or chromosome regions show a dearth of microsatellites
which will ultimately hamper the ability to undertake
genome-wide linkage analysis towards identifying loci.

Towards filling this gap, we describe here the isolation and
mapping of a novel set of 304 microsatellites, including 303
dinucleotide loci and one trinucleotide repeat locus. Mapping
and ordering of these loci in the feline 5,000-rad radiation
hybrid panel (Murphy et al., 1999) more than doubles the
microsatellite marker coverage on each autosome. Marker density
is fairly uniform at approximately one microsatellite marker
every five centimorgans (cM), and a total marker resolution
of one locus every 2.7 megabases (Mb). The current feline
radiation hybrid map now incorporates sufficient Type I and
Type II markers to facilitate the identification of genes controlling
phenotypes of interest within the domestic cat and its exotic
relatives, either through genome scans or candidate positional
cloning approaches.


Materials and methods 

Library construction/screening 

A (dG-dT)n·(dC-dA)n enriched microsatellite library was constructed
from DNA of a male domestic cat (FCA215), and recombinants were
sequenced as previously described (Sarno et al., 2000). A software tool was
developed for processing of the sequences of recombinants which removed
vector and linker sequences, discarded clones with fewer than ten uninterrupted
dinucleotide repeats, scanned our existing dinucleotide database and
removed new or previous duplicates (Menotti-Raymond et al., 1999), and
designed PCR primers using Primer 3.0 (Rozen and Skaletsky, 2000). Due to
the frequency of these microsatellite motifs in the genome, it is not unusual to
identify multiple microsatellites in a single recombinant clone, and this
explains the presence of the trinucleotide repeat (FCA1054) recovered from
this screen.
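
The repeat-length filter described above (discarding clones with fewer than ten uninterrupted dinucleotide repeats) could be sketched as follows. This is not the authors' software tool, whose implementation is not given; the function names, the choice of motifs (GT and CA in both phases) and the example sequences are assumptions made for illustration only.

```python
import re


def longest_dinucleotide_run(seq, motifs=("GT", "TG", "CA", "AC")):
    """Return the largest number of uninterrupted repeat units found.

    The four motifs cover (GT)n and (CA)n read in either phase; this
    motif set is an assumption, not taken from the original tool.
    """
    seq = seq.upper()
    best = 0
    for motif in motifs:
        # (?:GT)+ matches a maximal uninterrupted run of the motif.
        for match in re.finditer(f"(?:{motif})+", seq):
            best = max(best, len(match.group(0)) // 2)
    return best


def keep_clone(seq, min_repeats=10):
    """Keep a clone only if it has at least min_repeats uninterrupted units."""
    return longest_dinucleotide_run(seq) >= min_repeats


# Hypothetical clone inserts, made up for this example.
good_clone = "AAA" + "GT" * 12 + "CCC"             # 12 uninterrupted GT units
interrupted_clone = "GT" * 5 + "A" + "GT" * 5      # two runs of 5, broken by A

print(keep_clone(good_clone), keep_clone(interrupted_clone))
```

An interrupted repeat is deliberately rejected here even though it contains ten units in total, which matches the "uninterrupted" wording of the text.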

Genotyping in the radiation hybrid panel 

Microsatellite loci were mapped in the cat/rodent radiation hybrid panel
(Murphy et al., 1999) using the homogeneous 5'-nuclease assay of Van Etten
et al. (1999). A program was developed to automatically score the results of
the 5'-nuclease assay using the algorithm described by Van Etten et al.
(1999), plot the results for visual examination, and export the data for linkage
analysis. Loci which we were unable to map using this assay because they
either gave positive signals in hamster DNA, or exhibited too many discordancies
between the duplicate genotypes run for each primer pair, were genotyped
using agarose gels as previously described (Murphy et al., 1999, 2000).
Amplifications were performed using a touchdown protocol and PCR conditions
as described (Menotti-Raymond et al., 1999). A list of primers and conditions
will be available at the LGD website (http://home.ncifcrf.gov/ccr/lgd).
All microsatellite sequences have been deposited in GenBank under
accession numbers AY434736-AY435037.

Map construction 

The new microsatellites were assigned to feline chromosomal linkage
groups using the program RH2PT in the software package RHMAP
(Boehnke et al., 1991). An initial LOD threshold of 8.0 was used and then
subsequent groups assembled using decreasing LOD thresholds with 6.0
being the minimum LOD threshold used for any group. Markers within each
chromosome or linkage group were ordered using a reduction from the problem
of RH mapping to the traveling salesman problem (TSP) (Ben-Dor and
Chor, 1997), as implemented in the software rh_tsp_map described by Agarwala
et al. (2000). The instances of TSP were solved by the linkern module of
the software package CONCORDE (Applegate et al., 1998). These software
packages have been recently used to construct the canine genome map
(Guyon et al., 2003) and were evaluated in Hitte et al. (2003). We relied
primarily on the maximum likelihood (MLE) reductions from RH mapping
to TSP, as described in Menotti-Raymond et al. (2003).

We constructed an MLE-weight consensus map by first removing markers
that were defined as “too close” (i.e., within three obligate breaks from a
nearby marker). Using the remaining markers, we reduced each chromosome
or chromosome arm marker set to five instances of TSP, solved those using
CONCORDE, and translated the TSP solutions back to RH maps (Agarwala
et al., 2000). The instances of TSP differ according to whether the MLE
objective or the minimum obligate chromosome breaks (OCB) is used, and
according to how the ambiguous positions in the RH vectors are interpreted.
Markers included in the MLE-weight consensus map were those that were
ordered consistently by all three interpretations of the MLE criterion using
the map integration approach of Agarwala et al. (2000), but we also noted
those loci that were placed differently by MLE and OCB. Markers that were
removed from MLE-weight consensus map construction for being “too
close” were positioned into the constrained MLE-weight consensus map
using the RHMAXLIK program in the RHMAP software package (Boehnke
et al., 1991), which is based upon the maximum likelihood criterion. The
resulting map was termed the “ordered MLE map”. Markers that were initially
removed because their placement differed in the three different MLE
TSP solutions were placed into “bins” between adjacent markers or at the
terminus of a chromosome arm. To choose the best bin we considered adding
each marker to the ordered MLE map before the first marker, between each
pair of markers, and after the last marker using the “base MLE” criterion
(Agarwala et al., 2000). Each marker was then assigned to the most likely
interval within the ordered MLE map.
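
The obligate chromosome breaks (OCB) criterion mentioned above can be illustrated on a toy example. The sketch below is not rh_tsp_map or CONCORDE: it simply counts breaks for a given marker order and, for a handful of markers, tries every order by brute force instead of solving a TSP instance. The marker names, the RH vectors and the rule of skipping ambiguous scores are assumptions made for illustration (the text notes that ambiguous positions can be interpreted in more than one way).

```python
from itertools import permutations


def obligate_breaks(order, vectors):
    """Count obligate chromosome breaks implied by a marker order.

    vectors maps a marker name to a string with one character per hybrid
    cell line: '0' absent, '1' present, '2' ambiguous. A break is counted
    each time two consecutive informative markers differ within a line;
    ambiguous scores are skipped (one possible interpretation).
    """
    n_lines = len(next(iter(vectors.values())))
    breaks = 0
    for line in range(n_lines):
        previous = None
        for marker in order:
            score = vectors[marker][line]
            if score == "2":
                continue
            if previous is not None and score != previous:
                breaks += 1
            previous = score
    return breaks


def best_order(vectors):
    """Exhaustively find an order with the fewest obligate breaks.

    Only feasible for a few markers; real maps need a TSP solver.
    """
    markers = sorted(vectors)
    return min(permutations(markers), key=lambda o: obligate_breaks(o, vectors))


# Hypothetical RH vectors for three markers typed on eight hybrid lines.
toy_vectors = {
    "A": "11100000",
    "B": "11110000",
    "C": "11111100",
}

print(best_order(toy_vectors), obligate_breaks(("A", "B", "C"), toy_vectors))
```

An order and its reverse always give the same number of breaks, so the function returns whichever of the two comes first in the enumeration.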

Results 

Of the 746 microsatellite primer pairs designed and tested
in cat and hamster controls, approximately 460 generated
robust PCR products of expected size and were polymorphic in
a panel of ten outbred domestic cats. Of these, 354 were successfully
genotyped in the feline 5,000-rad RH panel. Of those
loci genotyped in the RH panel, 50 loci were dropped for one of
the following reasons: 1) insufficient 2-point LOD scores that
would allow confident assignment to a particular linkage group,
2) strong linkage to multiple chromosomes, or 3) the loci displayed
significantly elevated or reduced retention frequency
relative to other linked markers.

Three hundred and three novel dinucleotide repeat loci and
a single trinucleotide repeat locus were integrated or placed relative
to the existing data set of RH vectors (Murphy et al.,
2000; Menotti-Raymond et al., 2003). The total number of loci
located on the feline RH map is 1,126, split between Type I
(572) and Type II (554) loci. Of these, 1,085 (96%) were positioned
within the ordered MLE map using either rh_tsp_map/
CONCORDE (964 loci) or RHMAXLIK (121 loci), while 41
loci were placed in their most likely inter-locus intervals using
the binning approach. In the MLE-weight consensus map, 96%
of the LOD scores between adjacent markers are >6.0. The
remaining loci that had LOD scores <6.0 were largely found
a) at the end of chromosome arms, b) near centromeres where
increased retention frequency greatly reduces the average LOD
score between markers, or c) between markers around the
selectable locus on chromosome E1. Average genome-wide
locus retention is estimated at 39%, in agreement with previous
iterations of the feline RH map (Murphy et al., 2000;
Menotti-Raymond et al., 2003). An additional six Type I loci and 26
Type II loci are placed on the genetic map only, with a total of
1,165 markers on at least one of the maps.
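
As a reading aid, the locus counts quoted in the two paragraphs above can be tallied with a few lines of arithmetic. The variable names are ours, and the sketch only recombines numbers stated in the text and abstract; it is not part of the original analysis.

```python
# Counts taken from the text.
genotyped_in_rh_panel = 354
dropped = 50
new_loci = genotyped_in_rh_panel - dropped        # novel loci placed on the map

type_i_rh, type_ii_rh = 572, 554
total_rh = type_i_rh + type_ii_rh                 # loci on the RH map

ordered_by_tsp, ordered_by_rhmaxlik = 964, 121
ordered = ordered_by_tsp + ordered_by_rhmaxlik    # loci in the ordered MLE map
binned = 41                                       # loci placed by binning

type_ii_genetic_only = 26
microsatellites_mapped = type_ii_rh + type_ii_genetic_only

print(f"new loci: {new_loci} (303 dinucleotide + 1 trinucleotide)")
print(f"RH map total: {total_rh}")
print(f"ordered: {ordered} ({100 * ordered / total_rh:.0f}%), binned: {binned}")
print(f"microsatellites on either map: {microsatellites_mapped}")
```

The last figure reproduces the 580 mapped microsatellites given in the abstract.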

Table 1 summarizes the current marker composition of the feline
radiation hybrid map, partitioned by individual chromosome. Marker
coverage (both Type I and Type II) now averages one locus every
2.7 megabases (Mb), assuming a genome size of three billion base
pairs. Autosomal marker coverage ranges from 93 loci on chromosome
A1 (the largest feline autosome) to 31 on chromosome E3 (the
smallest feline autosome), with coverage being roughly proportional
to chromosome size.

Table 1. Summary of microsatellite and gene coverage in the feline
RH map and the feline genome

Feline      Type I  Type II  Total  Mean Type II  Mean physical  Physical       Genetic        RH length
chromosome  loci    loci     loci   density (cM)  density (Mb)   length^a (Mb)  length^b (cM)  (cR5000)

A1          37      56       93     4.3           3.1            288            242            2096.9
A2          40      33       73     6.0           2.9            210            193            1605.6
A3          33      21       54     7.2           3.1            168            151            1359.8
B1          28      54       82     4.9           2.9            231            263            1759.4
B2          38      22       60     9.5           2.9            174            208^b          1310.5
B3          47      19       66     8.2           2.5            168            156            1285.9
B4          38      33       71     5.7           2.3            162            187            1295.5
C1          49      39       88     5.4           2.9            258            209            1954.7
C2          29      43       72     3.9           2.4            174            166            1397.0
D1          35      39       74     3.0           2.0            144            115            1432.9
D2          20      28       48     5.0           2.5            120            140            995.7
D3          19      26       45     7.4           2.5            120            206            1253.2
D4          16      24       40     4.0           2.7            111            100            1027.8
E1          34      19       53     4.0           2.1            111            76             1308.2
E2          19      28       47     4.3           1.9            90             121            726.2
E3          24      7        31     9.6           2.2            69             67             662.9
F1          18      24       42     2.2           2.1            87             53             783.8
F2          7       27       34     4.7           2.6            87             127            637.8
X           33      10       43     18.0          3.5            150            180^b          864.2
Y           8       2        10     N.A.          6.9            69             N.A.           189.7
Total       572     554      1126   5.3           2.7            2991           2960           23947.7

a From Murphy et al. (2000), based on the fraction of the cytogenetic
map, listed in Menotti-Raymond et al. (1999).
b Genetic lengths (Kosambi cM) were re-estimated based on the
cytogenetic map due to poor microsatellite coverage in the linkage
map.
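The genome-wide densities quoted in the text can be checked against the "Total" row of Table 1. The short Python sketch below is only an illustration of that arithmetic; the constant and function names are chosen here and do not come from the original study.

```python
# Totals copied from the "Total" row of Table 1.
TOTAL_LOCI = 1126          # Type I + Type II loci on the RH map
TYPE_II_LOCI = 554         # microsatellite (Type II) loci on the RH map
PHYSICAL_LENGTH_MB = 2991  # summed physical length (Mb)
GENETIC_LENGTH_CM = 2960   # summed genetic length (cM)


def mean_spacing(total_length: float, n_loci: int) -> float:
    """Average interval between loci, in the units of total_length."""
    return total_length / n_loci


# Physical density uses all loci; Type II density uses microsatellites only.
physical_density = mean_spacing(PHYSICAL_LENGTH_MB, TOTAL_LOCI)
genetic_density = mean_spacing(GENETIC_LENGTH_CM, TYPE_II_LOCI)

print(f"one locus every {physical_density:.1f} Mb")
print(f"one microsatellite every {genetic_density:.1f} cM")
```

The same function applied to a single row (for example 288 Mb and 93 loci for chromosome A1) gives the per-chromosome densities listed in the table.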

Genome-wide microsatellite coverage has more than doubled with
respect to the maps in Menotti-Raymond et al. (2003), providing an
average density of one microsatellite locus every 5.3 cM. However,
density varies across chromosomes, ranging from a high of one
microsatellite marker every 2.2 cM on feline chromosome F1 to one
marker every 18 cM on the X chromosome (Table 1). Only three novel
X-linked microsatellites were added to the current map, reflecting
the lower than average recovery of polymorphic X chromosome
microsatellites in mammalian screens (Dietrich et al., 1996; Dib et
al., 1996). In addition, the present microsatellite screen was
performed on male genomic DNA, thus reducing the probability of
X-linked locus isolation. It is notable that of 11 Y-linked
microsatellites isolated from the current screen (confirmed by PCR
testing on multiple male and female genomic DNAs), all but one locus
(FCA1052) showed evidence of more than two alleles in PCR of
individual genomic DNAs, consistent with amplification of a
multicopy locus (data not shown). These ten multicopy Y-linked loci
likely represent members of near-identical amplicons, similar to
those that have been shown to populate the human Y chromosome
(Kuroda-Kawaguchi et al., 2001).

As an illustration, Fig. 1 displays integrated radiation hybrid and
linkage maps of feline chromosome D1, plus human comparative
sequence-based conserved segments. This example illustrates the
consistent marker order achieved between the RH and linkage maps for
most chromosomes, and the increased microsatellite density. Several
chromosomes showed a significant increase in microsatellite coverage
relative to previous maps (Menotti-Raymond et al., 2003), most
notably chromosomes B2 (from 5 to 23 loci) and F1 (from 4 to 24
loci), where polymorphic markers now cover the majority of each
chromosome. Discrepancies still exist between marker orders derived
from the two mapping approaches (e.g., A2 and B4; see
Menotti-Raymond et al., 2003 for discussion), but most involve
rotation of pairs of closely linked loci on one or both maps. Full
RH map tables can be viewed as supplemental information
(www.karger.com/doi/10.1159/000075762).

Discussion 

This study presents a third generation radiation hybrid map of the
feline genome, in which 304 novel microsatellite loci have been
mapped across all feline chromosomes. The current RH map shows
marked improvement over previous feline gene maps (Murphy et al.,
2000; Menotti-Raymond et al., 2003), now incorporating 572 Type I
coding gene loci and 554 polymorphic Type II microsatellite loci,
with a substantial enrichment of microsatellite markers on nearly
all feline chromosomes. Most notably, microsatellite density has
substantially increased on two chromosomes, B2 and F1, which
previously were significantly underrepresented compared to autosomes
of similar physical size (Menotti-Raymond et al., 1999, 2003; Murphy
et al., 2000; Sun et al., 2001).

Together with an additional 26 microsatellites ordered in the
interspecies linkage map (Menotti-Raymond et al., 1999) that failed
to map in the RH panel, the total number of microsatellite loci
mapped in the feline genome stands at 580. As a result, marker
density has increased from one polymorphic microsatellite locus
every 10 cM to one every 5.1 cM. Additional microsatellite coverage
will be obtained by linkage mapping of the 159 polymorphic
microsatellites that failed to map in the RH panel in this study
(Eizirik et al., 2003; Menotti-Raymond et al., unpublished data).

Fig. 1. Radiation hybrid (RH) map of feline chromosome D1 (FCAD1),
cross-referenced to the feline interspecies hybrid linkage map, and
the human chromosome 11 (HSA11) sequence-based comparative maps.
Microsatellites are denoted in boldface and by the prefix FCA or an
F, followed by a number. Type I coding loci are given by their human
gene homologue symbols, or in the case of feline expressed sequence
tags (ESTs), preceded by an Fc. and followed by the human
homologue’s UniGene cluster identifier (see associated data at
http://lgd.nci.nih.gov). GL framework distances (sex-averaged) are
given in centimorgans, with intermarker distances shown to the
right. The scale for the RH map corresponds to 1 bar every 100
cR5000. Blocks of conserved marker order with the human genome
sequence assembly are shown to the right, based upon BLAST
comparison of feline homologues with the sequence assembly (NCBI)
Build 30. Terminal marker locations of each conserved segment are
given in megabases. Feline markers physically localized to cat
chromosome D1 using a rodent/cat somatic cell hybrid panel (O’Brien
et al., 1997; Menotti-Raymond et al., 1999) are underlined.

Cytogenet Genome Res 102:272-276 (2003)


Isolation of sex chromosome-specific polymorphic markers remains a
limitation of the developing feline gene maps. At one locus
positioned roughly every 18 centimorgans, the X chromosome stands as
an outlier with respect to genome-wide microsatellite coverage
(Table 1). This is not surprising, given that male domestic cat DNA
was used in the current microsatellite screen. An X-linked enriched
library generated from flow-sorted chromosomes is currently under
development. Furthermore, Y microsatellite discovery appears to be
limited by the abundance of apparent multicopy loci isolated in our
current screens of male genomic DNA. This deficiency, too, will
likely be remedied by sequencing from Y chromosome-specific
libraries in regions of single-copy gene abundance.


With recent refinements in the feline-human comparative map (Murphy
et al., 2000, 2003; Menotti-Raymond et al., 2003) and the current
increase in microsatellite map density, feline genomics is now
poised to take advantage of full genome scans and candidate gene
approaches towards identifying genes causative in hereditary
pathologies modeling human disease. Future goals include refinement
of the feline-human comparative map in light of a finished human
genomic sequence, further increasing microsatellite density to 3 Mb
resolution, and closing gaps in sex chromosome coverage. These new
domestic cat mapping resources will also be valuable for identifying
genes controlling reproductive traits, pattern morphologies, and
other phenotypes of interest in exotic felids.
Abstract. The alignment of genome linkage maps, defined primarily by
segregation of sequence-tagged site (STS) markers, with BAC contig
physical maps and full genome sequences requires high throughput
mechanisms to identify BAC clones that contain specific STS. A
powerful technique for this purpose is multi-dimensional
hybridization of “overgo” probes. The probes are chosen from
available STS sequence data by selecting unique probe sequences that
have a common melting temperature. We have hybridized sets of 216
overgo probes in subset pools of 36 overgos at a time to
filter-spotted chicken BAC clone arrays. A four-dimensional pooling
strategy, including one degree of redundancy, has been employed.
This requires 24 hybridizations to completely assign BACs for all
216 probes. Results to date are consistent with about a 10% failure
rate in overgo probe design and a 15-20% false negative detection
rate within a group of 216 markers. Three complete rounds of overgo
hybridization, each to sets of about 39,000 BACs (either BamHI or
EcoRI partial digest inserts), generated a total of 1853 BAC
alignments for 517 mapped chicken genome STS markers. These data are
publicly available, and they have been used in the assembly of a
first generation BAC contig map of the chicken genome.

Copyright©2003 S. Karger AG, Basel 


Several agricultural animal genomes now have linkage maps on which
DNA-based markers have been aligned at resolutions approximating
1 cM. Many of those markers are sequence-tagged sites (STS),
typically microsatellite and single nucleotide polymorphism (SNP)
markers, including several within specific genes. For example, a
consensus genetic linkage map has been generated for the chicken
genome (Groenen et al., 2000; Schmid et al., 2000) consisting of
1,965 loci ordered on 50 linkage groups that together span about
3,800 cM. Approximately half of these markers derive from STS loci.
Genetic linkage maps are essential for the detection and analysis of
quantitative trait loci (QTL), along with other studies in the areas
of breeding and genome evolution.

Physical maps of agricultural animal genomes have assumed increasing
importance. Most of these are based on collections of continuous
sets of overlapping DNA inserts carried in bacterial artificial
chromosome (BAC) vectors, so-called “BAC contigs”. BAC contigs form
the platform on which full genome sequences generally are assembled
(Green, 2001). For the chicken genome, both local and genome-wide
BAC contig maps recently have been assembled (Crooijmans et al.,
2003; Ren et al., 2003), and these are in the process of being
refined. Positional cloning of QTL allele-containing genes and a
variety of other applications require accurate alignment of a
species’ linkage map with a physical, BAC-contig map. Such an
alignment also provides a means of ordering BAC contigs and
assigning the contigs to linkage groups and chromosomes. This, in
turn, is essential for the accurate assembly of the complete genome
sequence (Hoskins et al., 2000; Green, 2001).

Crooijmans et al. (2000) described the construction of a 5.5x BAC
library with inserts of HindIII-digested DNA from a White Leghorn
(WL) chicken and its screening by PCR analysis of pooled BAC DNA
templates. A larger BAC library (~15x) composed of three
sublibraries of BamHI, EcoRI and HindIII partial digest inserts,
respectively, from an inbred (UCD001) Red Jungle Fowl has also been
described (Lee et al., 2003). An alternative to PCR screening for
identification of BACs containing STS markers is the use of “overgo”
hybridization probes (Cai et al., 1998; Ross et al., 1999). Overgo
probes are 36-44 bp double-stranded oligonucleotides, labeled in
vitro, that are designed for efficient pooling and high throughput
hybridization. Here we describe the application of
multi-dimensional overgo probe hybridization to the integration of
the chicken genetic linkage map and its BAC contig map (Ren et al.,
2003). We demonstrate that this is an efficient approach that can be
applied to any species for which there exists a significant amount
of STS data, including microsatellite and SNP markers, expressed
sequence tags (ESTs) or BAC insert end sequences.


Methods and materials 

BAC libraries 

The BamHI and EcoRI sublibraries described by Lee et al. (2003) were
used. On-going analysis of the HindIII BAC sublibrary (Lee et al.,
2003) and another UCD001 BAC library, CHORI-261 (Children’s Hospital
of Oakland Research Institute, Oakland, CA,
http://bacpac.chori.org/chicken261.htm), has provided similar
results. BAC clone array filters were purchased from Genefinder
Genomic Resources (http://hbz.tamu.edu, Texas A&M University,
College Station, TX, USA).

Overgo probe design 

Overgo probes consist of two synthetic oligonucleotides that
typically contain an 8-bp region of complementarity (see
http://www.tree.caltech.edu/protocols/pictures/overgo.jpg). The
resulting single-stranded portions of the overgo are end-filled with
Klenow fragment DNA polymerase in the presence of labeled
deoxynucleotide triphosphates, leading to a radioactive,
double-stranded DNA fragment of 38-44 bp. Each overgo was designed
for a chicken STS marker, beginning with sequence information
obtained from GenBank
(http://www.ncbi.nlm.nih.gov/Genbank/index.html). An important
consideration in overgo design is the elimination of any repetitive
sequences. When an annotated chicken gene sequence is used, intron
or untranslated (UTR) sequence is preferable, in that order, to
minimize the chance of cross-hybridization between members of gene
families. When only cDNA sequence (e.g., EST data) is available,
3' UTR sequence is preferred due to its greater divergence between
gene family members and the low likelihood that it contains an
intron (as discussed in Suchyta et al., 2001). If only the coding
region of a cDNA sequence produced a suitable overgo, the
intron-exon arrangement of the homologous human gene was examined to
minimize the chance that the chosen overgo sequence from the cDNA
might span an intron in the chicken genome. After this selection
step, the available sequence information for a marker was masked to
eliminate known repetitive regions using the RepeatMasker script
(http://repeatmasker.genome.washington.edu/cgi-bin/RepeatMasker)
and then entered into the Overgo 1.02i program (Cai et al., 1998;
http://www.mouse-genome.bcm.tmc.edu/webovergo/OvergoInput.asp).
Overgo probe melting temperatures typically were chosen between 67.5
and 69.9 °C using the Integrated DNA Technologies Oligo Analyzer
program (http://207.32.43.70/biotools/oligocalc/oligocalc.asp).
Each overgo sequence was then checked again for potential
cross-hybridization to multiple regions within the chicken genome
using the BLASTN program (http://www.ncbi.nlm.nih.gov/blast).
Finally, each overgo sequence was examined manually to eliminate any
possible remaining hairpins and to ensure that it contained
sufficient sites for labeling by [32P]-dATP and -dCTP (preferably at
least ten such sites per overgo). If fewer than three G-C bp
occurred in the 8-bp overlap region, the length of overlap was
increased to 9 bp to ensure stable association between the two
oligonucleotides during the in vitro labeling step. Overgo primers
were ordered from Integrated DNA Technologies. Overgo sequences are
available from the authors on request.

Fig. 1. Three-dimensional design of BAC filter screening using 216
overgo probes. Each of six plates (lower case Roman numerals)
contains 36 cells in six rows (capital letters) by six columns
(Arabic numerals), and each cell represents a single overgo probe.
Hybridizations are done as plate, row and column pools. For example,
BACs that uniquely are complementary to overgo B2i will be those
BACs detected by the pool of all “B” overgos (one of six row pools),
all “2” overgos (a column pool) and all “i” overgos (a plate pool).
We also employ a fourth diagonal dimension (e.g., the first diagonal
pool includes A1i to F6i, then A2ii to E6ii plus F1ii, etc.,
diagonally back through the block) to provide one degree of
redundancy to the overgo detection process. Any BACs detected by the
B2i overgo should also be among those detected by this first
diagonal pool. BACs detected by three out of four of the appropriate
pools are examined manually as potential candidate positives.
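The overlap and labeling-site rules from the overgo probe design section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Overgo 1.02i program: the function names and the way the target is split around a centered overlap are hypothetical, and the sketch ignores melting temperature, repeat masking, BLASTN checks and hairpin screening.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_overgo(target: str, overlap: int = 8, min_gc: int = 3):
    """Split a target into two oligos that anneal over `overlap` bases.

    Assumption: the overlap is centered in the target. If it holds fewer
    than `min_gc` G/C bases it is widened by one base, following the rule
    given in the text.
    """
    target = target.upper()
    start = (len(target) - overlap) // 2
    core = target[start:start + overlap]
    if sum(core.count(base) for base in "GC") < min_gc:
        overlap += 1
        start = (len(target) - overlap) // 2
    end = start + overlap

    forward = target[:end]                        # top-strand oligo
    reverse = reverse_complement(target[start:])  # bottom-strand oligo

    # Klenow fill-in synthesizes target[end:] on the top strand and the
    # reverse complement of target[:start] on the bottom strand. Labeled
    # dATP and dCTP enter wherever a newly made strand carries A or C.
    new_top = target[end:]
    new_bottom = reverse_complement(target[:start])
    label_sites = sum(s.count(base) for s in (new_top, new_bottom)
                      for base in "AC")
    return forward, reverse, overlap, label_sites


# Hypothetical 40-base target, used only to exercise the function.
example = "ATGCATGCATGCATGC" + "GGCCGGCC" + "ATATATATATATATAT"
fwd, rev, used_overlap, sites = split_overgo(example)
print(fwd, rev, used_overlap, sites)
```

A design with fewer than about ten labeling sites would, by the criterion in the text, be a candidate for redesign.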

Overgo hybridization 

A 6 x 6 x 6 matrix of 216 probes was employed for each round of
overgo screening. Three-dimensional overgo pools (18 pools of 36
overgos each) were constructed as described in Fig. 1. Six
additional pools were hybridized (diagonals of the matrix in Fig. 1)
to provide a redundant fourth dimension hybridization. Overgos were
labeled as described (Ross et al., 1999;
http://www.tree.caltech.edu/protocols/overgo.html) with the
following modifications. Pairs of overgo oligonucleotides (10 pmol
each) were annealed as described in 10 µl H2O, followed by
incubation in a 25-µl reaction containing 0.1 mg/ml bovine serum
albumin, 5 µl overgo labeling buffer (OLB), 2.5 µl each of [α-32P]
dATP and dCTP (10 mCi/ml, ~3.3 µM, ICN Pharmaceuticals) and 2 U
Klenow fragment DNA polymerase (New England Biolabs) at room
temperature for 1 h. OLB was as described (Ross et al., 1999;
http://www.tree.caltech.edu/protocols/overgo.html). Deoxynucleotide
incorporation generally ranged between 10 and 50%, leading to overgo
specific activities of 0.5-2.5 x 10^6 dpm/pmol of overgo
oligonucleotide. Filters were individually prehybridized in 10 ml of
hybridization solution at 60 °C for 1-2 h, followed by a change of
hybridization solution if filters were used for the first time.
Labeled overgos were pooled in 10 ml of prewarmed hybridization
solution per filter and added to the filters, followed by incubation
with rotation at 60 °C overnight (not less than 12 h). Filters were
washed twice in 4x SSC, 0.1% SDS at room temperature for 5 min and
three times in 2x SSC, 0.1% SDS at 50 °C for 15 min. Filter arrays
were reused no more than twice. Prior to reuse they were stripped in
0.1x SSC, 0.1% SDS at 70 °C for 30 min.

Following hybridization and washing, BAC filters were exposed 2-3 d
to a phosphorimager screen, and hybridization patterns were obtained
using a Storm® gel and blot imaging system (Molecular Dynamics,
Amersham Biosciences). An example is shown in Fig. 2. All BACs were
double-spotted in a specific pattern. Positive BACs were called
manually. A Microsoft Access®-based program was used to analyze the
results of all hybridization pools containing a specific marker.
Each array image for any BAC that was called positive for at least
three of its four pools was re-examined by eye to confirm the
assignment. Approximate quality estimates for each assignment were
set as: P, probable (strong hybridization in all four appropriate
pools); T, tentative (faint hybridization in one or two of the four
pools); and W, weak (faint hybridization in more than two of the
four pools or [rarely] no hybridization detected in one of the four
pools, but strong hybridization in the other three).

Fig. 2. Hybridization results for one filter (of two filters total)
containing the EcoRI BAC library to one pool of 36 overgo probes.
See Methods and materials for details. Note that each positive BAC
generates a characteristic two spot pattern.
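The three-of-four pool criterion described in the Methods can be illustrated with a small Python sketch. It is not the Microsoft Access program used in the study: the pool labels, the zero-based indexing and the diagonal formula `(col - row - plate) % 6` are assumptions chosen to match the diagonal example given in the legend of the pooling figure (A1i to F6i, then A2ii to E6ii plus F1ii).

```python
from itertools import product

SIDE = 6
ROWS = "ABCDEF"                                # rows: capital letters
PLATES = ["i", "ii", "iii", "iv", "v", "vi"]   # plates: Roman numerals


def pools_for(row: int, col: int, plate: int) -> set:
    """The four pools (row, column, plate, diagonal) that contain a probe."""
    diagonal = (col - row - plate) % SIDE
    return {("row", row), ("col", col), ("plate", plate), ("diag", diagonal)}


def probe_name(row: int, col: int, plate: int) -> str:
    """Name in the style of the figure legend, e.g. B2i."""
    return f"{ROWS[row]}{col + 1}{PLATES[plate]}"


def candidate_probes(positive_pools: set, min_hits: int = 3) -> list:
    """Probes with at least `min_hits` of their four pools positive
    for a given BAC, returned as (name, number_of_positive_pools)."""
    hits = []
    for row, col, plate in product(range(SIDE), repeat=3):
        n = len(pools_for(row, col, plate) & positive_pools)
        if n >= min_hits:
            hits.append((probe_name(row, col, plate), n))
    return hits


# A BAC seen in all four pools of overgo B2i, then with one pool missed.
b2i_pools = pools_for(row=1, col=1, plate=0)
print(candidate_probes(b2i_pools))
print(candidate_probes(b2i_pools - {("diag", 0)}))
```

Because any three of the four coordinates fix the remaining one, a BAC seen in three appropriate pools still points to a single overgo, which is what allows the redundant dimension to rescue a missed hybridization.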


Table 1. Distribution of overgo markers screened and markers with
one or more BAC assignments (in parentheses) by chicken chromosome
or linkage group. Chromosome assignments and nomenclature are
according to Schmid et al. (2000) and Paton et al. (2003).

Chromosome        BamHI sublibrary  EcoRI sublibrary  EcoRI sublibrary
(linkage group)                     (screen 1)        (screen 2)

GGA1              121 (94)          39 (32)           33 (29)
GGA2              95 (77)           7 (4)             7 (7)
GGA3              -                 33 (31)           12 (12)
GGA4              -                 31 (29)           10 (10)
GGA5              -                 19 (19)           11 (7)
GGA6              -                 4 (3)             12 (12)
GGA7              -                 3 (2)             16 (16)
GGA8              -                 3 (2)             8 (7)
GGA9              -                 2 (2)             4 (3)
GGA10             -                 4 (4)             8 (8)
GGA11             -                 3 (3)             6 (4)
GGA12             -                 2 (2)             7 (7)
GGA13             -                 2 (2)             4 (4)
GGA14             -                 3 (2)             7 (5)
GGA15             -                 4 (4)             6 (6)
GGA16             -                 2 (0)             1 (1)
GGA17             -                 -                 2 (2)
GGA18             -                 2 (2)             6 (6)
GGA19             -                 2 (1)             6 (5)
GGA23             -                 1 (1)             5 (5)
GGA24             -                 2 (2)             15 (6)
GGA26             -                 3 (3)             5 (4)
GGA27             -                 2 (2)             6 (2)
GGA28             -                 4 (2)             8 (7)
GGAZ              -                 9 (7)             6 (5)
GGAW              -                 5 (5)             -
C24               -                 -                 1 (1)
C37               -                 1 (1)             -
E22C19W28         -                 1 (1)             -
E26C13            -                 3 (1)             -
E32E47W24         -                 5 (4)             2 (2)
E38               -                 3 (1)             -
E50C23            -                 2 (2)             -
E54               -                 2 (2)             2 (2)
E57               -                 1 (0)             -
unlinked          -                 5 (3)             -
Total             216 (171)         214 (181)         216 (185)


Results and discussion 


Results reported here derive from three rounds of hybridization with
216 overgo probes each, arrayed as shown in Fig. 1, with example
results shown in Fig. 2. The first round employed the BamHI
sublibrary (Lee et al., 2003) using 216 probes derived from chicken
STS markers on chromosomes 1 and 2. The second round screened the
EcoRI BAC sublibrary with overgo probes that were broadly
distributed across 33 linkage groups (Schmid et al., 2000; Paton et
al., 2003). Of these 216 probes, 42 overgos had been employed in the
earlier BamHI library screen but were re-used because they failed in
the first round or additional complementary BACs were sought. Of the
23 of these 42 probes that failed to identify a BAC in the BamHI
library, 18 overgos hybridized to at least one EcoRI BAC. The third
round of hybridization was also to the EcoRI BAC sublibrary,
involving 216 overgos that were distributed across 28 linkage
groups. Two of these were new overgos designed from different
sequences than those used in the previous EcoRI screen. In total,
our three overgo hybridization sets aligned 1,853 BAC clones with
517 mapped chicken genome STS markers. The first round of overgo
hybridization to the BamHI sublibrary generated a total of 481 BAC
assignments covering 171 markers, the first EcoRI screen generated
686 assignments for 181 markers and the second EcoRI screen resulted
in 695 assignments for 185 markers. (As noted above, some markers
were tested in more than one screen, so individual screen results do
not add up to the totals given above.) A summary of the genome
coverage of these marker-BAC assignments is shown in Table 1.
Overgo-BAC assignments have been obtained for all chicken
chromosomes and most linkage groups. A few small linkage groups
(Schmid et al., 2000) lack any markers for which overgo probes could
be designed. One linkage group, E57, generated only a single usable
overgo that, to date, has not yielded a BAC assignment. All BAC
assignments obtained are available at
http://poultry.mph.msu.edu/resources/Resources.htm#bacdata. This
site will be updated as additional overgo-BAC assignments are
completed. This list also includes marker-BAC assignments that we
obtained by non-overgo hybridization techniques (Lee et al., 2003)
and assignments contributed by other labs using a variety of
methods.

Table 2. Distributions of overgo markers according to the numbers of
BAC assignments per marker obtained in different experiments

                 Frequency of overgos detecting the number of BACs shown below
                 0    1    2    3    4    5    6    7    8    9    >9

Round 1 BamHI    44   59   35   26   21   15   8    2    0    3    2
Round 2 EcoRI    34   22   40   28   33   23   17   8    4    2    4
Round 3 EcoRI    31   28   31   36   32   23   16   8    5    3    3
Simulation 1^a   30   42   51   41   24   12   5    2    1    <1   <1
Simulation 2^b   28   22   37   42   36   25   14   7    3    1    <1

a Simulated data for a 216 overgo set, assuming a 10% (22) failure
rate in overgo design, effective library coverage of 3x and a false
negative rate for identification of true overgo-BAC matches of 20%
(i.e., 80% success rate).
b Simulated data for a 216 overgo set, assuming a 10% (22) failure
rate in overgo design, effective library coverage of 4x and a false
negative rate for identification of true overgo-BAC matches of 15%
(i.e., 85% success rate).


The results of the first two overgo screens were integrated into the
initial round of BAC contig assembly for the chicken genome physical
map (Ren et al., 2003). Of the 1,124 BAC assignments from these two
screens, 1,069 were located in 361 of the 2,331 contigs in the
current map. Fifty-one of these contigs were found to contain two or
more of the 350 DNA markers that gave rise to these assignments. The
locations of the common markers in the chicken consensus linkage map
(Groenen et al., 2000; Schmid et al., 2000) were in agreement with
their assignments to a single BAC contig. This suggests that at
least these overgo-BAC assignments are accurate and consistent with
the fingerprint-based physical map. BAC mapping data provide a
critical adjunct to fingerprint analysis in the assembly of BAC
contig maps. In this case, including the overgo BAC assignments
described above nearly doubled the average contig size of the
chicken physical map and greatly increased the number of large
contigs in the map
(http://hbz7.tamu.edu/homelinks/phymap/chicken/chick_home.htm; Ren
et al., 2003).

There are three major factors that determine the success of overgo
mapping. The first is overgo design itself. Some overgos may not
hybridize or may fail to adequately incorporate radioactive label.
(In our experience, the latter problem is rare.) For some markers,
especially chicken microsatellites, only very small regions of
non-repetitive sequence are available, so one may have to relax the
overgo design criteria or, occasionally, a marker cannot be used at
all. Furthermore, almost all available chicken sequences derive from
WL lines, whereas the BAC library derives from UCD001 Red Jungle
Fowl DNA. We have shown previously that UCD001 sequence differs by
about 1% from typical WL sequence, and most of the differences are
interspersed SNPs (Okimoto and Dodgson, 1997; Suchyta et al., 2001).
These sequence differences should not normally prevent
cross-hybridization of a WL overgo sequence to UCD001 BAC insert
DNA, but it is possible that a few overgos fail for this reason. A
small number of overgos also fail because they hybridize to too many
BACs. Due to the limited amount of chicken genome sequence available
(especially at the start of this project), it is not surprising that
there are unknown repetitive sequences that could not be excluded in
the overgo design process. Indeed, we typically observe that 1-2% of
our overgo probes hybridize to too many BAC clones to be useful or
meaningful. In total, we estimate that about 10% of our overgo
probes fail, regardless of whether or not any complementary BACs are
present on the filters tested. The second factor that determines
overgo success is library quality and size. Due to financial
limitations, each overgo group was hybridized to only a subset
(about 1/3) of the total BAC library. Based on average insert size,
the BamHI and EcoRI sublibraries each are estimated to contain
nearly 5-fold coverage of the chicken haploid genome, but
hybridization with gene fragments to the BamHI sublibrary
demonstrated that its effective coverage is probably significantly
less (Lee et al., 2003). The third factor is the likelihood that a
properly designed overgo will actually detect a complementary BAC,
assuming that this BAC is present among the arrayed clones. False
negatives may occur at this stage because of errors in robot
spotting of arrays, failures during the BAC clone growth process,
errors in the hybridization or washing protocols, background spots
that obscure true hybridization or lead to miscalls, or other errors
in the image analysis and data handling process.

One can obtain a rough estimate of the contributions of the three
types of failure described above by looking at the distribution of
markers according to the number of BACs that each detects (Table 2).
Non-functional overgo probes lead to markers that detect no BACs
under any circumstances, regardless of the quality of the library or
filters. The number of BACs on the filter array that are
complementary to any functional single copy overgo can be
approximated by a Poisson distribution ($m^{n}e^{-m}/n!$, where m is
the average genome coverage and n is the number of complementary
BACs for a given marker). Finally, the number of BACs actually
detected by a given overgo follows a binomial distribution,
determined by the number of complementary BACs present in the tested
array and the chance of successfully detecting each one. For
example, if two positive BACs exist on the filters and the average
chance of successful detection per BAC is 80%, then the chance of
identifying both BACs is 0.8 x 0.8 = 0.64, the chance of detecting
one BAC is 2 x (0.8 x 0.2) = 0.32, and the chance of detecting
neither is 0.2 x 0.2 = 0.04.
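The combined Poisson and binomial model described above can be written out as a short Python sketch. This is one reading of the model, not the authors' simulation code: the function names are hypothetical, the Poisson sum is truncated at an arbitrary `n_max`, and the output is not guaranteed to match every printed value in the simulation rows of Table 2.

```python
from math import comb, exp, factorial


def poisson_pmf(n: int, mean: float) -> float:
    """P(n complementary BACs on the filters) = m^n e^-m / n!."""
    return mean ** n * exp(-mean) / factorial(n)


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(k of n BACs present are detected), per-BAC success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def detection_pmf(k: int, coverage: float, p_detect: float,
                  n_max: int = 60) -> float:
    """P(exactly k BACs detected) for a functional single copy overgo,
    summing over the possible numbers of BACs present on the array."""
    return sum(poisson_pmf(n, coverage) * binomial_pmf(k, n, p_detect)
               for n in range(k, n_max + 1))


def expected_counts(n_probes: int, fail_rate: float, coverage: float,
                    p_detect: float, k_max: int = 9) -> list:
    """Expected number of overgos detecting 0..k_max BACs."""
    working = n_probes * (1 - fail_rate)
    counts = [working * detection_pmf(k, coverage, p_detect)
              for k in range(k_max + 1)]
    counts[0] += n_probes * fail_rate  # failed designs never detect a BAC
    return counts


# Parameters of the two simulations described in the Table 2 footnotes.
sim1 = expected_counts(216, 0.10, coverage=3, p_detect=0.80)
sim2 = expected_counts(216, 0.10, coverage=4, p_detect=0.85)
print([round(x, 1) for x in sim1])
print([round(x, 1) for x in sim2])
```

Changing `coverage` and `p_detect` shows how a smaller effective library or a higher false negative rate shifts markers toward the zero and one BAC classes.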

Table 2 shows the number of markers that detected a given number of
BACs for each of our three sets of 216 overgos. It also includes two
simulations of the distribution of BACs expected, assuming 10% (22)
of the overgos fail in the design stage. The first simulation uses
an effective library size of 3x and a random false negative rate in
the detection phase of 20%. This is a reasonable fit for our screen
of the BamHI sublibrary, although the actual false negative rate may
have been above 20% in this first trial. The other simulation uses
an effective library size of 4x and a random false negative rate in
the detection phase of 15%. This provides a good fit to both screens
of the EcoRI sublibrary. The improved accuracy of detection in these
latter two screens probably derives from the experience we gained in
the first screen and from the use of a different robot and spotting
pattern for the EcoRI array filters. Note that for all three screens
we observed more markers that hybridize to eight or more BACs than
expected by the simulated predictions. This may suggest that a few
overgos hybridize to more than one member of a multigene family.

The four-dimensional overgo array with one degree of redundancy that
we have employed is only one of several possible approaches. The
redundant hybridizations (six screens in addition to the 18
minimally required), along with accepting three out of four
positives as a criterion for further analysis, lead to “rescue” of
what otherwise would be many false negatives. Moreover, the
redundant data also often allow for resolution of BACs that would
otherwise appear to hybridize to two different, unlinked overgos.
These potential false positives may arise due to low level
cross-hybridization or to insufficient washing of filters between
reuse. A two-dimensional strategy involving similar pool sizes
(36 x 36) would require only three times the number of
hybridizations (72) to detect BACs for six times the number of
overgos (36^2 = 1,296). However, the number of false negative
results (and possibly false positives) probably would increase
considerably. A more desirable alternative for greater throughput
might be to increase the pool size but to retain the
three-dimensional pooling strategy (e.g., 8 x 8 x 8) along with the
use of a redundant hybridization set to improve data quality.
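The trade-off between pooling designs discussed above comes down to simple counting, sketched here in Python. The function name is hypothetical, and the sketch assumes a cubic grid in which a redundant dimension adds one extra pool per unit of side length, as the six diagonal pools do in the 6 x 6 x 6 design.

```python
def pooling_cost(side: int, dims: int, redundant: bool = False):
    """Return (probes, hybridizations, probes_per_pool) for a pooling
    grid with `dims` dimensions of length `side`."""
    probes = side ** dims
    hybridizations = side * dims + (side if redundant else 0)
    probes_per_pool = side ** (dims - 1)
    return probes, hybridizations, probes_per_pool


designs = {
    "6 x 6 x 6 with diagonal": pooling_cost(6, 3, redundant=True),
    "36 x 36": pooling_cost(36, 2),
    "8 x 8 x 8 with diagonal": pooling_cost(8, 3, redundant=True),
}
for name, (probes, hybs, pool_size) in designs.items():
    print(f"{name}: {probes} probes, {hybs} hybridizations, "
          f"{pool_size} probes per pool")
```

The counts say nothing about error rates; the text's concern is that a design without redundancy leaves no way to rescue a single missed pool.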


In summary, overgo hybridization provides a convenient and efficient
means to integrate linkage maps of animal agricultural species with
BAC contig maps and, eventually, full genome sequence assemblies. In
addition, these data provide an essential adjunct to the assembly of
high quality physical maps and genome sequences. In our hands, the
accuracy and efficiency of overgo analysis are superior to those of
PCR-based screening of BAC DNA pools, although either PCR screening
or individual probe hybridization screens (Crooijmans et al., 2000;
Lee et al., 2003) are more convenient when only a few markers or
genes are to be analyzed. The overgo method is ideal when large
numbers of probes are to be assigned in a high throughput fashion.

Abstract. The karyotypes of marsupial species are characterized
by their relatively low number of chromosomes, and
their conservation. Most species have diploid numbers lying 
between the two modes, 2n = 14 and 2n = 22, but the karyotype 
of Aepyprymnus rufescens is exceptional in containing 2n = 32 
chromosomes. Many differences in diploid number between 
marsupial species can be accounted for by particular fissions 
and fusions, which are easy to detect because of the low numbers
of chromosomes in each karyotype. This should be a system
in which it is possible to detect reversals and repeated chromosome
rearrangements. We have used chromosome-specific
paints derived from A. rufescens to compare the karyotypes of 
eight marsupial species, representing closely and distantly relat¬ 
ed taxa, to trace chromosome change during evolution, and 
especially to detect reversals and convergence. From these and 
other painting comparisons, we conclude that there have been 
at least three reversals of fusions by fissions, and at least three 
fusions or fissions that have occurred independently in differ¬ 
ent lineages. 

Copyright©2003 S. Karger AG, Basel 


Marsupials (viviparous, short-lived placentation) are classi¬ 
fied together with eutherians (viviparous, extended placenta¬ 
tion) and monotremes (egg laying) in the class Mammalia. 
Some 270 living marsupial species (divided into 16 recognized 
families) can be found in America, Australia, New Guinea and 
the East Indies. The only naturally occurring marsupial in the
United States is the opossum, Didelphis virginiana. In South
America and Australia, however, marsupials are an important 
group of land mammals (Graves and Westerman, 2002). Mar¬ 
supial species separated from eutherians around 180 million 
years ago and the American species diverged from the Austra¬ 
lian species around 70 million years ago (Woodburne et al., 
2003). Although marsupials are an ancient group of mammals 
and fill diverse niches, their karyotypes are highly conserved 
around two modal numbers, 2n = 14 and 2n = 22 (Sharman,
1961; Hayman and Martin, 1974). The widely accepted view
on marsupial karyotype evolution is that the ancestral stock had
a 2n = 14 karyotype represented by members of several families 
in each order (Rofe and Hayman, 1985; reviewed by Graves 
and Westerman, 2002), and that the 2n = 22 karyotype is 
derived by fissions. Other lower numbers are then derived 
through fusions. However, the converse theory, that a 2n = 22 
karyotype was ancestral and the 2n = 14 karyotype was formed 
by fusions is favored by observations of telomere sequences at 
evolutionary breakpoints in one American species (Marmosops 
incanus) with an ancestral 2n = 14 karyotype. 

New data showing that the Australian species are monophy- 
letic, but American species are not (Woodburne et al., 2003), 
and giving robust phylogenies for Australian groups (Amrine- 
Madsen et al., 2003) now necessitate a reevaluation of chromo¬ 
some change in marsupial evolution. We have used the higher 
resolution afforded by chromosome-specific paints derived 
from the exceptional 2n = 32 species of Aepyprymnus rufescens 
to characterize 19 evolutionary blocks shared by all marsupials 
and rearranged in different marsupial orders. 

We have particularly examined the question of whether par¬ 
ticular rearrangements may be reversed in evolution, or may 
occur independently in different lineages. Fusions and fissions
are well known in marsupials. For instance, karyotypic variation
in the subfamily Macropodinae, from low numbers (e.g.
Wallabia bicolor, 2n = 10 female, 2n = 11 male) to high numbers
(Thylogale and Petrogale species, 2n = 22), is accounted for
by evolution from a proposed macropodid ancestral 2n = 22
karyotype (Eldridge et al., 1992; Eldridge and Close, 1993; Glas
et al., 1999; O’Neill et al., 1999). According to the current view
of macropodid karyotype evolution, the chromosome number
first increased from the 2n = 14 ancestral set to 2n = 22, then
decreased to, for instance, 2n = 16 in Macropus eugenii, 2n = 14
in Dendrolagus matschiei, and 2n = 10 female, 2n = 11 male in
Wallabia bicolor. The rufous rat-kangaroo Aepyprymnus rufescens
(2n = 32) is a member of the family Macropodidae. It has the 
highest chromosome number reported so far for a marsupial, 
and its karyotype has been regarded as the result of further fis¬ 
sion from a 2n = 22 karyotype (Hayman, 1990). 

It is of interest to investigate if this pattern of chromosome 
number variation encompasses recurrent fusions and fissions 
(i.e. convergence or reversal of earlier rearrangement events). 
Convergence is defined as shared derived similarities that are 
not based on a singular common origin, but on an independent 
origin in different taxa. Reversal is defined as the secondary 
presence of an apparently “primitive” character state, which is 
not homologous with the actual plesiomorphic state. 

In our study into this possibility, whole chromosome paints 
of Aepyprymnus rufescens made from flow-cytometry-sorted 
chromosomes were used for reciprocal chromosome painting 
(Ferguson-Smith, 1997). The advantage of these paints is that 
they create a chromosome homology map with a higher resolu¬ 
tion than used in our earlier studies (Rens et al., 1999, 2001). 
Chromosome homology studies reported previously were in¬ 
cluded in the analysis in order to increase the comparison to 
fifteen marsupial species, which represent five families in three 
orders (Kirsch et al., 1997; Amrine-Madsen et al., 2003). 

Materials and methods 

Fibroblast cell lines from Aepyprymnus rufescens, 2n = 32 (ARU), 
Monodelphis domestica, 2n = 18 (MDO), Macropus eugenii, 2n = 16 (MEU), 
Sminthopsis crassicaudata, 2n = 14 (SCR), Trichosurus vulpecula, 2n = 20 
(TVU) were established in the Dept. of Genetics, La Trobe University,
Australia (McKay et al., 1992; Toder et al., 1997). A fibroblast cell culture from
Didelphis marsupialis, 2n = 22 (DMA) was established in the Dept. of Genetics,
Federal University of Pará, Belém, Brazil. Short-term fibroblast cultures
were also obtained from M. domestica skin samples kindly provided by
Prof. M.W.J. Ferguson, University of Manchester, UK. The Potorous tridactylus
(PTI) cell line, 2n = 12 female, 13 male (PtK1 cell line; Walen and
Brown, 1962) was obtained from Dr. R. Moore, Peter MacCallum Cancer
Centre, Melbourne, Australia. For A. rufescens, cell culture, flow sorting, 
chromosome paint production and fluorescence in situ hybridization were 
performed according to the protocol described previously (Rens et al., 1999). 
Chromosome-specific paints for the other species were produced during ear¬ 
lier studies (Rens et al., 1999, 2001). 

Chromosome painting 

Unidirectional chromosome painting was performed between ARU, 
PTI, SCR, and DMA, with ARU as the source genome. Reciprocal chromo¬ 
some painting was used between ARU, MDO, MEU and TVU. Images were 
captured using the Leica QFISH software (Leica Microsystems) and a cooled 
CCD camera (Photometrics Sensys) mounted on a Leica DMRXA microscope
equipped with an automated filter wheel, DAPI and Cy3 specific filter
sets and a 63×, 1.3 NA objective. Cy3 and DAPI signals were captured separately
as 8-bit black and white images, normalized and merged to a 24-bit
color image. Enhanced chromosome bands were obtained by processing the DAPI
image with a 5 x 5 high pass spatial filter and displaying the image in
contrast-adjusted reverse video. Hybridization signals were assigned to specific
chromosome regions defined by enhanced DAPI banding patterns through 
merging the DAPI-banded images with signal images. Image processing was 
performed with Leica CW4000 software or IPLab Spectrum software. 
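
The DAPI band enhancement described above can be illustrated with a minimal sketch. The exact kernel used by the Leica and IPLab software is not given, so this assumes one common form of high pass filtering (subtracting the 5 x 5 local mean), followed by a linear contrast stretch and inversion for reverse video. All function names are hypothetical, and images are plain lists of pixel rows.

```python
def box_mean(img, size=5):
    """Mean of each pixel's size x size neighbourhood (clipped at the edges)."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - r), min(h, y + r + 1))
                    for xx in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def high_pass(img, size=5):
    """Subtract the local mean: keeps fine detail, removes slow background."""
    low = box_mean(img, size)
    return [[p - m for p, m in zip(row, low_row)]
            for row, low_row in zip(img, low)]


def stretch_to_8bit(img):
    """Linear contrast stretch of all values onto the range 0-255."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat image
    return [[round(255 * (v - lo) / span) for v in row] for row in img]


def reverse_video(img8):
    """Invert an 8-bit image so bright DAPI bands are displayed dark."""
    return [[255 - v for v in row] for row in img8]


def enhance_dapi(img):
    """High pass filter, contrast stretch, then reverse video."""
    return reverse_video(stretch_to_8bit(high_pass(img)))
```

Applied to an image with one bright vertical stripe, the stripe ends up as the darkest feature, which is the banding-like appearance the method aims for.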

GTG banding was performed using a modification of standard protocols. 
Slides aged 2-4 weeks were immersed in 0.04% trypsin in Hank’s balanced
salt solution for 2-3 min, rinsed 3 times in phosphate buffer (pH 6.8), stained 
with 2% Giemsa in phosphate buffer (pH 6.8) for 5 min, rinsed with water 
and air dried. Images were enhanced with Leica CW4000 software. 

Chromosome homology map 

Chromosome homologies are determined by chromosome painting and 
conserved regions are defined as those regions that are homologous in all 
species studied. 

Consulted reports 

The paper by Johnston et al. (1984) reports on karyotype comparison 
between Potorous longipes and Potorous tridactylus. Two wallabies are dis¬ 
cussed by Toder et al. (1997a). Glas et al. (1999) discuss the karyotype of 2n = 
22 species regarded as ancestral to karyotypes of species in the Macropodidae 
family; in addition the 2n = 14 karyotype of Dendrolagus matschiei is 
described. De Leo et al. (1999) investigated the relationships of 2n = 14
marsupial karyotypes. Karyotypes of distantly related marsupials within Australia
but also in South America are described by Rens et al. (1999, 2001). Svartman
and Vianna-Morgante (1998) considered karyotype evolution of South
American marsupial species. 

From these reports, data on four species (Dendrolagus matschiei [DMT],
Thylogale thetis [TTH], Potorous longipes [PLO], Lasiorhinus latifrons
[LLA]) are included in this report. The classification of these species is as
follows (see Fig. 1). Dendrolagus matschiei (2n = 14) and Thylogale thetis
(2n = 22) belong to the Macropodidae family and they are grouped together 
with Macropus eugenii in the Macropodinae subfamily (Burk et al., 1998). 
Potorous longipes (2n = 24) belongs also to the Macropodidae family but is 
grouped with Potorous tridactylus in the subfamily Potoroinae (Burk et al., 
1998; Kirsch et al., 1997). All these species together with Lasiorhinus lati¬ 
frons and Trichosurus vulpecula belong to the order Diprotodontia (Kirsch et 
al., 1997). The other three species studied are in different orders. Sminthopsis 
crassicaudata is in the order Dasyuromorphia, family Dasyuridae. Monodel¬ 
phis domestica and Didelphis marsupialis are in the order Didelphimorphia 
(Kirsch et al., 1997). 


Results 

Chromosome paint production 

The flow karyotype of a male Aepyprymnus rufescens (2n = 32)
is shown in Fig. 2. Twelve peaks can be identified.
Chromosomes 1-5, 9, 10, 13, X, and Y are represented by
separate peaks and uncontaminated paints were produced readily.
Unfortunately, a single peak represents chromosomes 6,
7, 8, 14 and 15 and another single peak represents chromosomes
11 and 12. Specific paints for these chromosomes were
obtained by amplification and labeling of single sorted chromo¬ 
somes. ARU paints were used in this study to create a higher 
resolution homology map than the ones produced in our earlier 
studies. 

Chromosome homology in closely related species 

All of the 16 chromosomes were preserved intact on a single 
chromosome in other macropodid species. Figure 3 shows ex¬ 
amples of cross species chromosome painting. 

[Fig. 1: tree diagram. Surviving labels: orders Didelphimorphia (family
Didelphidae; 2n = 14, 18, 22), Paucituberculata, Microbiotheria,
Dasyuromorphia (family Dasyuridae) and Diprotodontia (families Vombatidae,
Phalangeridae and Macropodidae, the last with subfamilies Hypsiprymnodontinae,
Potoroinae and Macropodinae). Genera shown: Didelphis, Monodelphis,
Bettongia, Aepyprymnus, Potorous, Dorcopsis, Dendrolagus, Thylogale,
Petrogale and Macropus.]

Sampled species and diploid numbers: DMA, Didelphis marsupialis (22);
MDO, Monodelphis domestica (18); SCR, Sminthopsis crassicaudata (14);
LLA, Lasiorhinus latifrons (14); TVU, Trichosurus vulpecula (20);
ARU, Aepyprymnus rufescens (32); PLO, Potorous longipes (24);
PTI, Potorous tridactylus (12); Sp, unknown species (22);
DMT, Dendrolagus matschiei (14); TTH, Thylogale thetis (22);
PPE, Petrogale pearsoni etc. (22); MRU, Macropus rufus (20);
MEU, Macropus eugenii (16); WBI, Wallabia bicolor (10).

Fig. 1. Phylogeny of the marsupial species studied. The numbers refer to the diploid number of chromosomes. 


ARU paints were first used to detect homology in karyotypes
of its close relatives, the potoroo (PTI) and tammar wallaby
(MEU) (Figs. 3 and 4). All ARU chromosomes form conserved
blocks in the karyotypes of PTI and MEU, reflecting the
close relationship between these three species. For example,
ARU chromosomes 2, 5 and 10 make up the relatively large
chromosome 1 of MEU and PTI1q, only in a different
order. Conserved regions C8-C9 form one region, to which C1
is fused at different sites, forming MEU1p and the terminus of
PTI1q (Fig. 4). PTI1 (but not MEU1) contains the C4 region,
which in MEU forms 4p (Fig. 3a, b). MEU4q is homologous to
PTIXq.

Chromosome homology in distantly related species 
Painting of more distantly related species (Australian species
SCR, TVU and South American species DMA and MDO)
resulted in the chromosome homology map presented in Fig. 4.
Conserved regions (C1-C19) noted on the right side of each
chromosome remained intact between all species. For instance,
C9 (the longest acrocentric in the ARU karyotype) labeled the
long arm of MDO chromosome 1 (MDO1q) and SCR2, and is conserved
on a single chromosome in all species (Fig. 3c, d).

The paint of C12 was created from a single (instead of 400)
sorted ARU11 chromosome, because this chromosome type
did not form a peak on its own in the flow karyotype (ARU11
and 12 are together, see Fig. 2). The paint gave good results
when hybridized to ARU (Fig. 3e), PTI, or MEU (Fig. 3f) metaphases,
but did not produce good signals when hybridized to
MDO or DMA metaphases. C12 was assigned indirectly to
MDO7 using the following results: in MEU it forms the p-arm
of chromosome 5. MEU5 is homologous to SCR3q. MDO7 is
homologous to SCR3p plus a region just under the centromere
(see Rens et al., 2001). This region is painted by the paint of
ARU11, i.e. C12. Importantly, C12 is linked to C10 in
MDO and not to C11 as it is in all other species except ARU
(see Discussion). Note also that the centromere of MEU5 is the
fusion point of C12 and C11 from MDO to the conserved complement.

Again, thirteen of the 16 ARU chromosomes represent
blocks that are conserved on a single chromosome in all marsupial
species. Three ARU chromosomes each consist of two conserved
regions (ARU1 = C13/C14; ARU7 = C2/C3; and
ARU15 = C7/C17).

Fig. 2. Flow karyotype of Aepyprymnus rufescens. Chromosomes 6-8, 14,
15 and 11, 12 are each represented by one peak; all other chromosomes are
represented by an individual peak.

Recurrent fusions and fissions

A number of homology results are described below, as they
are important for the discussion on parallel and reversed chromosome
rearrangements during marsupial chromosome evolution.

Segments C1-6 (Fig. 5a, f). Two apparently recurrent fusions/fissions
are detected. These segments are all joined in the
2n = 14 karyotype thought to be ancestral, but are separated
into 1-2-3 and 4-5-6 in both American species, and 1, 2-3, 4-5-6
in macropodids (and 1, 2-3, 4, 5-6 in potoroids). Depending on
whether the ancestral marsupial was more similar to the American
than the Australian species, this represents either a 3-4
fusion between the American and Australian clades followed by
a 3/4 fission in macropods, or two independent 3/4 fissions in
American and Australian groups. A fusion between C2-3 and
C18 has apparently occurred independently in PTI and DMT,
and is not present in other potoroids or macropodids (see
Fig. 3g, h).

Segments C7-9 (Fig. 5b). Three recurrent fusions/fissions
are apparent. C9 is defined by a single chromosome in ARU,
but is fused to C8 in all marsupial species except DMA and
TVU. This means either that C9 forms a single chromosome in
the ancestor, which fused with C8 and subsequently underwent
fission in ARU and TVU, or that it was fused in the ancestor
and underwent fission independently in DMA and ARU. C7 is
fused with C8 in American marsupials and 2n = 14 Australian
marsupials, but underwent fission in macropodids, then subsequent
fusion in DMA.

Segments C10-12 (Fig. 5c). A fusion between C10 and C11
that occurred in 2n = 14 species has been reversed in the ancestor
of the phalangerids and macropodids, and re-fused in PTI
and DMA. Alternatively, an ancestor had C10-11, and there
was independent fission in the American and Australian cohorts.
C11 is fused to C12 in all species except in the American
species and in ARU. This means either two independent fissions,
or a fusion between C11 and C12 in the Australian species
followed by a fission in ARU.

Segments C13-14 (Fig. 5d). These segments are together in
all species (including ARU) except DMA and TVU, representing
independent recurrent fission of these segments in two
distantly related species.

Segments C15-16 (Fig. 5e). These segments are together in
all species except ARU.

Thus eight recurrent fusions or fissions can be documented
by these painting results.
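
The groupings given for segments C1-6 can be compared mechanically. The sketch below is illustrative only: it encodes just the four C1-6 arrangements stated in the text, assumes the blocks lie in the linear order 1 to 6 within each group, and uses hypothetical names (GROUPINGS, junctions, compare). It does not infer the direction of change, which depends on the phylogeny.

```python
# C1-C6 arrangements as stated in the text; each tuple is one linked group.
GROUPINGS = {
    "2n = 14 karyotype": [(1, 2, 3, 4, 5, 6)],
    "American species": [(1, 2, 3), (4, 5, 6)],
    "macropodids": [(1,), (2, 3), (4, 5, 6)],
    "potoroids": [(1,), (2, 3), (4,), (5, 6)],
}


def junctions(grouping):
    """Set of adjacent block pairs that sit on the same chromosome."""
    return {(a, b) for group in grouping for a, b in zip(group, group[1:])}


def compare(reference, other):
    """Junctions present in only one of two arrangements.

    'separated' pairs are joined in the reference but not in the other
    arrangement; 'joined' pairs are the converse.
    """
    ref = junctions(GROUPINGS[reference])
    oth = junctions(GROUPINGS[other])
    return {"separated": sorted(ref - oth), "joined": sorted(oth - ref)}


for name in ("American species", "macropodids", "potoroids"):
    print(name, compare("2n = 14 karyotype", name))
```

Relative to the 2n = 14 arrangement, the 3/4 junction is the only one separated in all three groups, which is the junction the text identifies as either reversed or broken independently.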


Discussion 

Karyotype evolution in marsupials 

Recent clarification of marsupial phylogeny makes it possi¬ 
ble to offer a simple explanation of the role of fusions and fis¬ 
sions in marsupial karyotype evolution. Figure 1 shows the 
relationships of the major groups, as deduced from molecular 
and other characteristics (Amrine-Madsen et al., 2003; Westerman
et al., 2003) with chromosome numbers noted on each
branch. It is now clear that Australian marsupials are mono- 
phyletic, but South American marsupials may not be (see also 
Phillips et al., 2001; Amrine-Madsen et al., 2003). The order 
Microbiotheria (2n = 14, Reig et al., 1972) is within Australi- 
delphia. Paucituberculata (five out of seven species have 2n = 
14, Hayman et al., 1971) is closer to Australidelphia than to the 
Didelphimorphia, which have species with 2n = 18 and 22, as 
well as 14. This favors the hypothesis that the ancestors of all 
the groups except didelphids were 2n = 14, but leaves open the 
question of whether the ancestor of all marsupials had 2n = 14, 
or a higher number, as proposed by Svartman and Vianna- 
Morgante (1998). It should be noted that FISH has not yet been 
performed on species from Microbiotheria or Paucitubercu¬ 
lata. 

Macropodid marsupials (kangaroos and wallabies), as well 
as some possums and gliders have a range of chromosome num¬ 
bers from 2n = 10-32. It has been proposed that a common 
ancestor of this group had a 2n = 22 karyotype, typified by Thy- 
logale and Petrogale species, derived by several fusions as 
described by De Leo et al. (1999). Independent fusions between
different chromosomes gave rise to the 2n = 10, 14, 16, 18, 20
and 22 karyotypes (Rofe and Hayman, 1985; Eldridge et al.,
1992; Eldridge and Close, 1993; Glas et al., 1999; O’Neill et al.,
1999), and further fissions occurred to produce the exceptional
2n = 32 karyotype of A. rufescens.

Fig. 3. Examples of cross-species chromosome painting.
Aepyprymnus rufescens chromosome 6 paint hybridized
to (a) Potorous tridactylus chromosome 1 and (b)
Macropus eugenii chromosome 4. (c) Aepyprymnus rufescens
chromosome 2 paint hybridized to Trichosurus vulpecula
chromosome 4 and (d) Trichosurus vulpecula chromosome
4 paint hybridized to Aepyprymnus rufescens chromosome
2. (e) Aepyprymnus rufescens chromosome 11
paint hybridized to Aepyprymnus rufescens chromosome
11 and (f) Macropus eugenii chromosome 5. (g) Aepyprymnus
rufescens chromosome 7 paint (red) and Aepyprymnus
rufescens chromosome 14 paint (green) hybridized to
Macropus eugenii chromosomes 3 and 6 and (h) Potorous
tridactylus chromosome 4. Panel labels: C4 on PTI, C4 on MEU,
C9 on TVU, C9 on ARU, C2C3-C18 on MEU, C2C3-C18 on PTI.

Fig. 4. Comparative chromosome map of the seven
marsupial species. All chromosomes are G-banded except
ARU, which are DAPI banded. 19 conserved regions
(C1-C19) are shown next to each chromosome. The arrows
point to the centromeres.

Fig. 5. Karyotype phylogeny presented per chromosome of the 2n = 14 karyotype, (a) chromosome 1, segment 1-6, (b) chromosome
2, segment 7-9, (c) chromosome 3, segment 10-12, (d) chromosome 4, segment 13-14, (e) chromosome 5, segment 15-16,
(f) chromosome 1, segment 17-18. Fission is Fusion is Fusion or Fission is “O”.

Cross-species painting using the paints afforded by A. 
rufescens flow sorted chromosomes has therefore provided a 
higher-resolution chromosome homology map than the one 
published previously (Fig. 4), and added the karyotypes of 
Aepyprymnus rufescens and Didelphis marsupialis. This map 
will be useful for the development of gene maps of these spe¬ 
cies. Our painting results concur with conclusions made from 
morphology (Hayman and Martin, 1974), and from banding 
studies (Rofe and Hayman, 1985), that there has been limited 
chromosome rearrangement throughout the entire infraclass 
Marsupialia. All the diversity of karyotypes can be accounted 
for by the rearrangement of just 19 evolutionary blocks. 

Recurrent events in marsupial karyotype evolution 

A change in chromosome structure between species can be 
regarded as a rare genomic event and a useful parameter for 
phylogenetics (Rokas and Holland, 2000), but only when these 
changes are not prone to extensive convergence or parallel evo¬ 
lution. 

The low numbers of chromosomes, and the extremely con¬ 
served karyotypes of marsupials make this group ideal for 
examining convergence and parallel evolution. Eight clear ex¬ 
amples have been documented; four of convergence by serial 
fission/fusion, and four of apparently the same fusion, or fis¬ 
sion, occurring in independent lineages. 

It is possible that some of these events could be artifacts of 
incorrect phylogenetic relationships. For instance, the fusions 
(C10-11, 12 and C2, 3-18) are both common to the two species
with reduced chromosome numbers, Dendrolagus matschiei 
and Potorous tridactylus. These convergences disappear if these 
two species are classified together, e.g. Dendrolagus matschiei is 
placed within subfamily Potoroinae. However, the tree kanga¬ 
roos (genus Dendrolagus) are traditionally considered to be 
within the subfamily Macropodinae on many characters 
(Kirsch et al., 1997; Burk et al., 1998; Colgan, 1999), although 
all have low bootstrap confidence limits. In any case, a reclassif¬ 
ication of D. matschiei introduces a new convergence between 
C4 and C5-6. 

Repeated fission events are impossible to account for by 
classification errors: for instance, the 8/9 fission occurred in the 
American marsupial D. marsupialis, and again in the distantly
related phalangerid marsupials of Australia. Likewise, the 13/ 
14 fission occurred in D. marsupialis, and again in T. vul- 
pecula. 

Reversal events are also difficult to avoid by reclassifying 
species. First consider the reversal of C10-C11-12. If the ances¬ 
tral marsupial had these blocks on different chromosomes, as 
suggested by Svartman and Vianna-Morgante (1998), a fusion 
must have taken place in the 2n = 14 lineage that led to the 
Australian marsupials. This was then reversed by a fission 
occurring in an ancestral phalangerid, since these blocks are 
separate in Trichosurus vulpecula, as well as in the macropodine
ancestral karyotype (represented by Thylogale thetis). These
two blocks fused once more independently in P. tridactylus and
D. matschiei, as discussed above.

Even more difficult to account for are reversals that occur in 
widely separate taxa. For instance, the fusion of C11-12, and
between C8 and C9 that occurred in 2n = 14 species is reversed 
only in A. rufescens (and TVU for C8, C9) among Australian 
species. Likewise, the fusion of C3-4 that occurred in 2n = 14 
species is reversed in family Macropodidae only. 

The question of the original ancestral marsupial karyotype, 
which cannot be resolved without reference to an outgroup, 
does not affect these arguments. In the above arguments, we 
have assumed that the 2n = 22 didelphid species do indeed 
reflect an ancestor with a higher chromosome number, as sug¬ 
gested by the interstitial telomeres of one of the 2n = 14 didel- 
phids (Svartman and Vianna-Morgante, 1998). However, if the 
original ancestor had 2n = 14, it would still be necessary to 
score parallel rearrangements, and reversals. For instance, the 
fused C10-11 would then be considered ancestral, but independent
fissions would then have to be postulated in the didelphids
and phalangerids. Likewise, an originally fused C3-4 would 
have undergone independent fission in didelphids and macro- 
podids. 

Breakpoints 

Reversals and convergence were therefore found in a relatively
high proportion of the few chromosome rearrangements during
the karyotype evolution of the marsupial species studied.

The repeated involvement of the same two chromosomes in 
rearrangements might be due to the relatively few chromo¬ 
somes in marsupial karyotypes, and thus a higher chance that 
the same chromosomes are involved. However, five out of the 
six reversals involve fusion/fission at the same breakpoint, 
within the resolution of the chromosome painting method. The 
sensitivity of chromosome painting to detect even small regions 
left by a fission close to a previous fusion point can be seen in 
the investigation of the compound sex chromosomes of the 2n =
10, 11 macropodid Wallabia bicolor (Toder et al., 1997a). An
XY1Y2 system that had been ascribed to the fusion of the Y
with an autosome was revealed to have resulted from fusion of 
the autosome with both the X and Y, followed by fission near 
but not at the original fusion point. 

Non-random association of telocentric chromosomes, and 
chromosome arms might be understandable if all the fusions and 
fissions were Robertsonian rearrangements, involving centric 
fusion and fission. However, only for two of these cases can 
reversal be explained by centric fusion of two acrocentric chro¬ 
mosomes followed by centric fission at the fused centromere. For 
the other three, the co-localization of the fusion/fission site sug¬ 
gests that the original fusion event leaves a “mark” that is vulner¬ 
able to subsequent fission. For instance, fusion of telomere 
sequences may leave repetitive sequence at which breakage may 
occur preferentially. Dehal et al. (2001) describe a high concentration
of L1 repeats and retrovirus-associated LTR sequences at
sites of evolutionary breaks when human chromosome 19 was 
compared with mouse chromosomes. Breakpoints in mouse 
chromosomes are also related to arrays of zinc-finger genes or OR 
genes. Although in a slightly different context, Pevzner and Tes- 
ler (2003) also discuss the reuse of breakpoints. 

Chromosome rearrangement as a phylogenetic tool
The surprising number of reversal and convergence events 
in marsupial karyotype evolution makes it difficult to use chro¬ 
mosome rearrangements effectively to define relationships of 
species, as briefly discussed by Rens et al. (2001). One could 
state that this is due to the lack of character definition: there are 
only 19 total homology blocks. Avoiding reversals or conver¬ 
gences is then impossible due to the combination of low chro¬ 
mosome number and relatively high number of fusions and fis¬ 
sions. However, the statement of “lack of character definition” 
assumes that marsupials only have a choice of 19 blocks to 
“create” their karyotypes, which first might not be true and sec¬ 
ond, in itself states that the breakpoints are fixed. 


With so many apparent reversals, evolutionary chromo¬ 
some rearrangements prove to be a disappointingly imprecise 
indicator for phylogenetic relationships among marsupials. 

Chicken genome sequence: a centennial gift to
poultry genetics

Abstract. A draft sequence of the chicken genome will be
available by early 2004. This event conveniently marks the 
start of the second century of poultry genetics, coming 100 
years after the use of the chicken to demonstrate Mendelian 
inheritance in animals by William Bateson. How will the sec¬ 
ond, post-genomic century of poultry genetics differ from the 
first? A whole genome shotgun (WGS) approach is being used 
to obtain the chicken sequence, with the goal of generating 
approximately six-fold coverage of the genome. Bacterial artificial
chromosome (BAC) and fosmid clone end sequences, along
with a BAC contig map integrated with genetic linkage and
radiation hybrid maps, will form the platform for assembly of
the WGS data. Rapid progress in global analysis of chicken
gene expression patterns is also being made. Comparative
genomics will link these new discoveries to the knowledge base
for all other animal species. It is hoped that the genome
sequence will also provide common ground on which to unite 
studies of the chicken as a model species with those aimed at 
agriculturally-relevant applications. The current status of 
chicken genomics will be assessed with projections for its near 
and long term future. 

Copyright©2003 S. Karger AG, Basel 


From Punnett square to genome sequence 

The year 2002 marked the 100th anniversary of the publication
by William Bateson of “Experiments with poultry” (Bateson,
1902), the first scientific paper on poultry genetics (according
to Warren, 1958), in which the dominant white feather color
trait of White Leghorns was examined (Smyth, 1990). Bateson 
was among those who rediscovered the work of Mendel, and he 
and his successor at Cambridge, R.C. Punnett, went on to do 
seminal research, not only in poultry, but for all of classical 
genetics (e.g., Bateson and Punnett, 1905, 1906). Punnett’s 
name, of course, lives on in the “Punnett square”, that ubiquitous
tool of elementary genetics. There is neither space here,
nor is this the place, to review the past century of poultry genet¬ 
ics. The focus will be on “genomics” and the future. 

Chicken: An aging, but still attractive, model 

From the days of Bateson and Punnett, the chicken has been 
a valuable “model organism” for genetic studies, in addition to 
its obvious importance as a major commodity for animal agri¬ 
culture. Even Thomas Hunt Morgan studied sex-linked inheri¬ 
tance in poultry (Morgan and Goodale, 1912) before he turned 
to that less palatable, but more fecund winged creature, Dro¬ 
sophila melanogaster. Since the money that drives genome 
sequencing, at least in the US, comes mainly from the National 
Human Genome Research Institute (NHGRI), it was exactly 
this importance as a model species that provided the impetus 
for sequencing the chicken genome. The value of the chicken 
for research in virology, developmental biology, oncology, im¬ 
munology and disease was emphasized in the white paper pro¬ 
posal to NHGRI, leading to the chicken genome being awarded 
“high priority” for full genome sequence analysis (see
http://www.genome.gov/page.cfm?pageID=10002154). More important,
this was also a primary factor in persuading the Washington
University Genome Sequencing Center (WUGSC) to initiate
its chicken genome sequencing project
(http://genome.wustl.edu/projects/chicken/).

Table 1. Current status - at a glance

Status of chicken genomics        June, 2003                                 Projected
Physical map: BAC contigs         2331 contigs, 650 kb average size (a)      <300 contigs, 2003-2004
WGS sequence coverage             ~1X                                        >6x, November, 2003
Finished genome sequence          for selected BACs and gene regions only    unknown
Reference linkage map             ~2000 markers (b)                          >20,000 markers, post-sequence
Radiation hybrid mapping          RH panel available (c)                     framework RH map, 2004
Chicken ESTs                      418,093 (dbEST), ~600,000 (total)          one million (total), 2004
Chicken arrays                    <4000 elements/<3000 genes (d)             14,000 element array, late-2003
Chicken-human comparative map     whole genome, low resolution (b);          whole genome comparative
                                  specific chromosomes, moderate             sequence analysis, 2003-2004
                                  resolution (e)

(a) Ren et al., in press.
(b) Groenen et al. (2000); Schmid et al. (2000).
(c) Morisson et al. (2002).
(d) Neiman et al. (2001, 2003); Liu et al. (2001); Min et al. (2003).
(e) Buitenhuis et al. (2002); Jennen et al. (2002).

Brown et al. (2003) provide a
recent update of the case for the chicken as a model organism, 
with particular focus on applications in developmental biology 
and functional genomics. These authors point out that even the 
existing incomplete collection of chicken genes contains homo- 
logues for 90% of the human inherited disease genes in the 
Online Mendelian Inheritance in Man “morbid map” (http:// 
www.ncbi.nlm.nih.gov:80/htbin-post/Omim/getmorbid). Fur¬ 
thermore, sequence analysis of selected gene clusters has 
shown that the chicken genome provides a very informative 
comparison to mouse and human genomes to aid in the anno¬ 
tation of exons and conserved regulatory domains (Gottgens et 
al., 2000; Margulies and Green, 2003). Finally, when it comes 
to population traits, chickens are probably the most populous 
species of any domesticated animal (assuming the housefly 
doesn’t qualify). While it’s clear that the chicken will never 
replace the mouse as the primary model experimental system 
for human biology, it can certainly compete with most other 
vertebrates both in terms of past contributions and future 
potential. 

Current status of chicken genomics 

Physical map of the chicken genome 

The (haploid) chicken genome exists as approximately 1.2 × 10^9
base pairs of DNA organized into 38 autosomes and two
sex chromosomes, Z and W. The eight largest autosomes and 
the sex chromosomes can be distinguished in stained mitotic 
spreads and are often designated macrochromosomes, while 
the remaining autosomes are microchromosomes. Almost all of 
the microchromosomes can now be aligned with genetic linkage 
groups or, at the least, identified by FISH hybridization to 
cloned sequences (Schmid et al., 2000). A more useful physical 
map is one in which most, if not all, of the genome can be 
assigned to collections of contiguous overlapping recombinant 
DNA inserts carried in bacterial artificial chromosome (BAC) 


vectors, so-called “BAC contig” maps. Crooijmans et al. (2000) 
prepared a 5.5x BAC library with inserts of HindIII-digested
DNA from a White Leghorn (WL) chicken. A larger BAC library
(~15x) composed of three sub-libraries of BamHI, EcoRI
and HindIII partially digested DNA, respectively, from an
inbred (UCD001) Red Jungle Fowl has also been described 
(Lee et al., 2003). A second UCD001 BAC library, CHORI-261 
(Children’s Hospital of Oakland Research Institute, Oakland, 
CA, http://bacpac.chori.org/chicken261.htm), designed to con¬ 
tain especially large inserts (average size, 195 kb) is also now 
available. Both local and genome-wide chicken BAC contig 
maps recently have been assembled (Crooijmans et al., 2003; 
Ren et al., in press) based on BAC fingerprint analysis, chromo¬ 
some walking and identification of BACs that contain specific 
markers and genes. The Ren et al. map consists of 2,331 contigs 
that average about 650 kb in size. The map covers about 1.3x 
the haploid chicken genome, suggesting that many of these con- 
tigs can, in fact, be merged, but statistically significant overlap 
has yet to be detected by fingerprint analysis alone. Efforts con¬ 
tinue in several labs to improve the BAC contig map by com¬ 
bining data from all three BAC libraries described above to 
merge overlaps and generate a physical map that provides full 
coverage with the minimum number of contigs. BAC contigs 
form the platform on which full genome sequences generally 
are assembled (Green, 2001), and they are also the bridge 
between the genome sequence and the linkage map essential for 
quantitative trait locus (QTL) analysis (Romanov et al., 2003). 
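
The 1.3x figure follows directly from the contig count and average contig size quoted above. The arithmetic can be sketched as follows; the function and variable names are illustrative only, and the genome size is the approximate haploid value given earlier in this section.

```python
# Figures quoted in the text for the Ren et al. map; names are illustrative.
N_CONTIGS = 2_331
MEAN_CONTIG_BP = 650_000
HAPLOID_GENOME_BP = 1.2e9  # approximate value given earlier in the text


def map_coverage(n_contigs, mean_contig_bp, genome_bp):
    """Summed contig length expressed as a multiple of the genome size."""
    return n_contigs * mean_contig_bp / genome_bp


coverage = map_coverage(N_CONTIGS, MEAN_CONTIG_BP, HAPLOID_GENOME_BP)
# Any summed length beyond 1x must come from contigs that overlap
# without the overlap having been detected.
excess_bp = N_CONTIGS * MEAN_CONTIG_BP - HAPLOID_GENOME_BP

print(f"map coverage: {coverage:.2f}x")
print(f"summed contig length beyond 1x: {excess_bp / 1e6:.0f} Mb")
```

Because the average contig size is itself rounded, the result should be read as an estimate of the redundancy remaining in the map rather than an exact value.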

Complete genome sequence 

The ultimate physical map is the complete sequence of the 
chicken genome. As noted above, NHGRI has provided sup¬ 
port for a draft sequence, and the WUGSC is now in the pro¬ 
cess of generating the sequence data. The aim is to generate 6x 
whole genome shotgun (WGS) coverage of the genome. The 
approximately 10 million random reads will be assembled 
using low sequence coverage of selected BAC clones and end 
sequences of BAC and fosmid clones (Green, 2001; Burt and 
Pourquie, 2003), along with the BAC contig maps described 
above. WGS sequencing and fosmid libraries have been con¬ 
structed with DNA from the same UCD001 hen employed for 
the Lee et al. (2003) and CHORI-261 BAC libraries. As of 
October, 2003, the National Center for Biotechnology Information
(NCBI) Trace Archive (http://www.ncbi.nlm.nih.gov/
Traces/) lists 9,849,000 chicken accessions, mostly from 
WUGSC but some from sequencing of selected regions by the 
NIH Intramural Sequencing Center. Together, nearly 6x coverage
of the chicken haploid genome has now been deposited. It’s
projected that the WGS collection phase of the project will be 
complete by November, 2003. 
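
As a rough consistency check on these numbers, the sketch below derives the mean read length implied by the quoted read count and coverage, and the fraction of the genome that an idealized shotgun would leave unsampled. The Poisson model is an assumption introduced here for illustration (uniformly random reads, no cloning bias or repeats); it is not a description of the WUGSC assembly process, and the function names are hypothetical.

```python
import math

GENOME_BP = 1.2e9        # approximate haploid genome size from the text
N_READS = 9_849_000      # Trace Archive accessions quoted in the text
TARGET_COVERAGE = 6.0    # "nearly 6x"


def implied_read_length(coverage, genome_bp, n_reads):
    """Mean usable read length needed to reach a given coverage."""
    return coverage * genome_bp / n_reads


def fraction_uncovered(coverage):
    """Expected fraction of bases hit by no read under a Poisson model."""
    return math.exp(-coverage)


read_len = implied_read_length(TARGET_COVERAGE, GENOME_BP, N_READS)
missed = fraction_uncovered(TARGET_COVERAGE)

print(f"implied mean read length: {read_len:.0f} bp")
print(f"expected unsampled fraction at 6x: {missed:.4f}")
```

Under these assumptions about a quarter of a percent of the genome would still be unsampled at 6x, which is one reason independent resources such as BAC contig and RH maps remain useful during assembly.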

Radiation hybrid map 

Radiation hybrid (RH) maps have been a powerful adjunct 
in bridging the gap between genetic linkage maps and BAC con- 
tig maps for many species. RH mapping of the chicken genome 
has lagged due to difficulties in constructing RH cell panels that 
retain suitable levels of chicken genome fragments with ade¬ 
quate stability. Morisson et al. (2002) finally described a useful 
chicken RH panel last year. This group and their colleagues are 
presently working to map a substantial number of markers on 
the RH panel, to generate a framework RH map that can be 
used to locate and order any new chicken gene or sequence of 
interest. Like the BAC contig map, the RH map will provide an 
independent platform to assist the chicken genome sequence 
assembly process. 

Genetic linkage map 

The consensus chicken reference linkage map (Groenen et 
al., 2000; Schmid et al., 2000) consists of over 2,000 markers 
mapped with varying precision in one or more of three separate 
crosses. The map is fairly inclusive (i.e., any new marker has a 
>95 % chance of being linked to an existing marker), and it pro¬ 
vides a suitable tool for low resolution (10-20 cM) QTL 
searches (e.g., Vallejo et al., 1998; van Kaam et al., 1999a, b; 
Yonash et al., 1999). However, there remain several regions of 
poor marker density and the number of polymorphic microsa¬ 
tellites that can be used in many different populations is still 
insufficient. The genome sequence will provide the basis for a 
high resolution linkage map, as it will provide many more 
sequenced microsatellite loci. Primmer et al. (1997) estimate 
40-50,000 di-, tri- or tetranucleotide microsatellites in the 
chicken genome. As an example, already the WGS data in the 
Trace Archive contain 1,528 perfect matches to just one single 
microsatellite sequence, (AC)9, that could be exploited to develop
new markers. Moreover, the UCD001 genome being sequenced
has been shown to differ by about 1 % from typical WL
line genome sequences (Okimoto et al., 1997; Suchyta et al., 
2001). Thus, comparison of WGS data to existing chicken 
sequences (mostly WL) will generate a dense collection of single 
nucleotide polymorphisms (SNP). It remains uncertain how 
many of these SNP will be widely polymorphic in commercial 
lines, but haplotype sets of closely linked SNP should provide 
alternative highly polymorphic markers for QTL mapping and 
marker assisted selection. Thus, one outcome of the genome 
sequence project will be the creation of a much higher resolu¬ 
tion and much more versatile genetic linkage map. This will 
remain an essential tool for the analysis of new phenotypic vari¬ 
ation and for fine structure mapping efforts to identify the 
molecular basis for QTL allele effects. 
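
Searching sequence reads for perfect microsatellite runs, as described above for the AC dinucleotide, can be sketched with a regular expression. This is a minimal illustration, not the procedure used on the Trace Archive: the function name, the default repeat threshold and the example sequence are all assumptions, and only the forward strand is scanned.

```python
import re


def find_perfect_repeats(seq, unit="AC", min_repeats=9):
    """Return (start, end, n_repeats) for each perfect run of `unit`.

    Coordinates are 0-based and end-exclusive. Only the given strand is
    scanned; the reverse-complement unit would need a second call.
    """
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_repeats},}}")
    hits = []
    for match in pattern.finditer(seq.upper()):
        n_repeats = (match.end() - match.start()) // len(unit)
        hits.append((match.start(), match.end(), n_repeats))
    return hits


# Hypothetical read: one run of ten AC units and one run too short to count.
example_read = "GGT" + "AC" * 10 + "TTG" + "AC" * 3 + "G"
print(find_perfect_repeats(example_read))
```

Flanking sequence on either side of each hit is what would be needed to design PCR primers for a new marker, so in practice one would also record the neighbouring bases.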


Gene expression 

Chicken expressed sequence tag (EST) accessions have grown 
dramatically in the last two years. As of October, 2003, the NCBI 
dbEST (http://www.ncbi.nlm.nih.gov/dbEST/) listed 451,565 
chicken entries, for sixth place on the species dbEST list. A large 
fraction of these comes from Boardman et al. (2002). These 
authors were able to assemble 323,670 of their sequences into 
85,486 EST contigs (46,674 of these were singletons), and they 
used the data to estimate that chickens express about 35,000 
genes, similar to earlier estimates for humans and mice. They 
also estimate that 20% of these EST contigs represent full length 
cDNAs, so a substantial portion of the overall chicken gene set 
has been fully sequenced, already. A number of other chicken 
EST projects are underway (e.g., U. of Delaware, http://www. 
chickest.udel.edu/; the German National Research Center for 
Environment and Health, GSF, http://swallow.gsf.de/DT40/ 
dt40Est.html), and it’s estimated that there are now over 
600,000 chicken EST among the various labs (Burt and Pour- 
quie, 2003). It’s not unreasonable to expect this number to climb 
to one million or more in the not-too-distant future. 

Chicken EST contigs can form the basis for the construction 
of microarrays, used in global profiling of gene expression. 
Chicken gene microarrays have been used to analyze changes in 
gene expression in response to infection with Marek’s disease 
virus (Liu et al., 2001) and Eimeria (Min et al., 2003) and in 
neoplasias induced by c-myc and c-myb over-expression (Neiman
et al., 2001; 2003). A variety of other microarray-based
studies are underway. By late-2003, it’s expected that a 14,000 
transcript chicken microarray (glass slide) will be available 
from the Fred Hutchinson Cancer Research Facility (P. Nei- 
man and J. Burnside, personal communication). Once the 
genome sequence has been completed, it is reasonable to antici¬ 
pate that arrays will be developed containing elements that 
query the expression of all the estimated 35,000 chicken genes. 
However, it’s important to note that the functions of most 
chicken genes will remain hypothetical, even after we have a 
sequence. Furthermore, phenotypic variation (e.g., QTL) may 
derive from differences in protein function at least as often as 
they derive from differences in mRNA expression, and mi¬ 
croarrays detect only the latter. 

Comparative genomics 

An initially surprising level of conserved synteny and, to a 
lesser extent, gene order has been observed between the chicken 
and human genomes (Burt et al., 1999; Groenen et al., 2000; 
Waddington et al., 2000; Suchyta et al., 2001). Inevitably,
as improved genetic maps and, especially, physical maps
are obtained, the number of rearrangements detected
between these two genomes is increasing rapidly (Jennen
et al., 2002; Buitenhuis et al., 2002). Still, these and other
comparative maps will be immensely valuable in annotating 
the chicken genome and assigning likely functions to chicken 
genes, based on known functions of homologous genes in other 
species. As additional large sequence contigs from the chicken 
genome become available, it’s clear that direct sequence to 
sequence comparison between chicken and human genomes 
reveals a very high density of intergenic and intragenic discon¬ 
tinuities (Gottgens et al., 2000; Margulies and Green, 2003). 
However, many of these discontinuities likely will be retrotransposon
insertion/deletion events that don’t preclude useful
comparisons being made in terms of common gene order, func¬ 
tion, and regulation of expression. 

Transgenic chickens 

Brown et al. (2003) review the many tools now available for 
analysis of gene function in chicken cell lines and embryos. 
Most of these involve manipulation of genes in various chicken 
somatic tissues, and these have proven very useful to the field 
of developmental biology, in particular. Unfortunately, it re¬ 
mains difficult and expensive to generate chicken lines that sta¬ 
bly inherit an exogenous transgenic sequence in a Mendelian 
fashion. The one proven method involves retroviral vectors 
(Salter et al., 1987), and, even in the best cases, this method is 
inefficient, costly and limited in scope. Many other approaches 
(Ivarie, 2003) have shown promise but remain in develop¬ 
mental stages. Ironically, the chicken DT40 cell line has proven 
to be the optimal vertebrate system for genetic manipulation of 
chromosomes in cultured cells due to its high rate of homolo¬ 
gous recombination (Buerstedde and Takeda, 1991). However, 
useful analogues of the embryonic stem cell and nuclear trans¬ 
fer techniques now widely used in mammalian transgenesis 
have yet to be fully established for the chicken. An effective and 
affordable method to create “knock-out” and “knock-in” germ 
line transgenics remains the single greatest need for continued 
exploitation of the chicken as a leading model organism. 

The future? 

Will we ever finish? 

Existing funding by NIH-NHGRI will likely suffice for the 
completion of a high quality draft sequence of the chicken 
genome. Due to a significantly lower rate of interspersed repeti¬ 
tive elements, this draft chicken sequence will probably be 
somewhat more accurate than the draft human genome se¬ 
quences that were originally published two years ago (Interna¬ 
tional Human Genome Sequencing Consortium, 2001; Venter 
et al., 2001). Comparisons between these two sequences dem¬ 
onstrated numerous discrepancies between the two (and be¬ 
tween each of them and independent maps), presumably due to 
errors in both assemblies (Venter et al., 2001). In the past two 
years, considerable effort has been expended to “finish” the 
human sequence, resolving and minimizing any gaps and other 
errors. WUGSC has expressed an interest in finishing the 
chicken genome sequence, and funds have been requested 
(from USDA) for this purpose. The finishing process presum¬ 
ably would be much cheaper for the smaller, less repetitive 
chicken genome in comparison to the human or mouse ge¬ 
nomes, and as the only avian genome likely to be sequenced in 
the near future, a finished quality sequence might be especially 
valuable for the chicken. However, it’s still fair to ask whether 
the limited funds available (especially at USDA) would be bet¬ 
ter used for draft sequencing other species’ genomes as opposed 
to finishing that of the chicken. The outlook is presently uncer¬ 
tain. Regardless, it is clear that a wealth of collateral data, 
including high quality BAC contig maps thoroughly integrated 


with linkage and RH maps, improved comparative maps and 
further marker development will all be essential in order to 
assemble the WGS sequence data into a reliable and informa¬ 
tive resource. 

Annotation and bioinformatics 

For a genome sequence to be of real value it needs to be 
“annotated”, i.e., genes and other important elements (regula¬ 
tory regions, transposable elements, pseudogenes, etc.) must be 
delineated within the sequence. In the cases of some model 
organisms, the annotation process has been done as a large 
group effort among the relevant community of scientists. How¬ 
ever, even adding together the existing communities of agricul¬ 
tural and model organism-based chicken geneticists would pro¬ 
vide a much smaller group than for most species whose 
genomes have been sequenced to date (except for sea squirts
and pufferfish). A chicken genome consortium of over 40 scien¬ 
tists has come together to begin to consider this task (Burt and 
Pourquie, 2003), but the bulk of the annotation process will 
surely be automated using one or more of the variety of com¬ 
puter programs designed for this process. These programs all 
have significant error rates, but the growing and substantial 
EST data for the chicken will provide a valuable adjunct to 
improve the accuracy at least of exon/intron identification and 
alternative splicing analyses. 

Interest in providing bioinformatics support for the chicken 
genome sequence has already been expressed by the EN- 
SEMBL (http://www.ensembl.org) consortium of the European 
Bioinformatics Institute (Burt and Pourquie, 2003) and by 
NCBI (http://www.ncbi.nlm.nih.gov; see Hamernik and Adel- 
son, 2003). It seems likely that one or both of these highly qual¬ 
ified groups will provide the basic support necessary for general 
use of the chicken genome sequence. It’s also likely that several 
other groups will be involved in extracting and providing infor¬ 
mation that’s of special interest to subsets of the overall user 
community. ArkDB at the Roslin Institute (http://www.ri. 
bbsrc.ac.uk/chickmap), the WUGSC (http://genome.wustl. 
edu/projects/chicken/), Genefinder Genomic Resources at 
Texas A&M (http://hbz.tamu.edu/), the U. of Delaware (http:// 
www.chickest.udel.edu/), the GSF DT40 site (http://swallow. 
gsf.de/dt40.html/), the UMIST Chicken EST site (http://www. 
chick.umist.ac.uk/), Wageningen U. (http://www.zod.wau.nl/ 
vf/), and the USDA National Animal Genome Research Pro¬ 
gram Poultry Genome Mapping page (http://poultry.mph.msu. 
edu/), among others, will continue to provide special purpose 
data and services. For a more detailed discussion of the needs 
for future bioinformatics support, see Hamernik and Adelson 
(2003). 

The specter of a sequence: moon shot or gold mine? 

What does the genome sequence mean to the future of poul¬ 
try genetics? Will it provide a gold mine of information for 
breeders and geneticists that will revolutionize the way they do 
business? Or will it be more analogous to sending a man to the
moon: an impressive and, perhaps, mind-altering achievement,
but one with so little impact that a generation has gone by
without trying it again? Reasonable arguments can be made on both
sides.
Certainly, the range of hypotheses that will be open to further
investigation will expand enormously. Instead of, perhaps,
one or two thousand characterized chicken genes, we’ll have 
35,000. Will the community of poultry geneticists and the 
resources available to them expand similarly? It will surely be 
possible to measure expression levels of most of these genes 
(even if we don’t know what they do) in many different tissues 
at a variety of developmental times and following a variety of 
treatments. What will we make of 20-50 million data points 
(e.g., 20,000 genes in 1,000-5,000 samples, a conservative esti¬ 
mate)? Can we assemble these data into some higher order 
understanding of poultry biology, like the eye assembles pixels 
on a computer monitor into a comprehensible image not appar¬ 
ent in any small subset of the pixels themselves? Arbeitman et 
al. (2002) recently published a paper (with nearly 12 Mb of sup¬ 
plemental text, tables and figures) that measured gene expres¬ 
sion for about one third of all Drosophila melanogaster genes 
throughout its entire life history (reared in the lab). Certainly 
this is an impressive accomplishment, and the authors were 
able to extract several general conclusions of interest out of the 
blizzard of data, but how does this compare, for example, to the 
isolation of bithorax mutants by Ed Lewis (Lewis, 1978), a sem¬ 
inal achievement later recognized with the Nobel Prize? Genes 
of the bithorax complex are included among the 176-page listing
of developmentally regulated Drosophila genes by Arbeitman
et al. (2002), but would their central role have been identified
de novo? Perhaps we will find out from studies of the
chicken, which rivals Drosophila in length of use as a model 
species, but not in the number of mutants already analyzed. 

Nevertheless, genome sequences and global expression data 
have changed the way both Drosophila and human genetics are 


done, forever, and we can expect the same for the chicken and 
other agricultural animals. Perhaps most important, gene ho¬ 
mologies have broadened the context of poultry genetics, allow¬ 
ing one to examine and utilize all the data on a gene of interest, 
regardless of the species in which the studies first were per¬ 
formed. Genomics offers the prospect of truly connecting poul¬ 
try science, not just to all of animal science, but to all of biology. 
Hopefully, it will also bring together the rather disparate com¬ 
munities of those interested in the chicken as a food animal and 
those interested in it as a model organism. These two groups 
have much to learn from one another, and the genome 
sequence should provide the common ground on which they’ll 
meet. It seems less likely that the genome sequence will have a 
major short term impact on the practice of commercial poultry 
breeding. It will take some time for the poultry science commu¬ 
nity to digest the massive influx of new genomic data and distill 
the specific applications from which breeders can benefit. This 
is the process we’re already seeing in the slow, but steady, con¬ 
version of human genome data to new medicines and therapies. 
Since the poultry breeding industry has already produced a 
very economical product using quantitative genetics, cost- 
effective applications of genomic technology in poultry will be 
even harder to come by than for other species, but no less inevi¬ 
table in the long run. 
Abstract. Different genomic resources in chicken were integrated
through the Wageningen chicken BAC library. First, a
BAC anchor map was created by screening this library with two 
sets of markers: microsatellite markers from the consensus link¬ 
age map and markers created from BAC end sequencing in 
chromosome walking experiments. Second, HindIII digestion


fingerprints were created for all BACs of the Wageningen chick¬ 
en BAC library. Third, cytogenetic positions of BACs were 
assigned by FISH. These integrated resources will facilitate fur¬ 
ther chromosome-walking experiments and whole-genome se¬ 
quencing. 

Copyright©2003 S. Karger AG, Basel 


Chicken (Gallus gallus , GGA) has a long history as a model 
organism for developmental biology, immunology and micro¬ 
biology in vertebrates (Brown et al., 2003). In recent years, 
much effort has been made to create different genome mapping 
resources in this animal. A standardized karyotype of chicken 
was published by the International System for Standardized 
Avian Karyotypes in 1999 (Ladjali-Mohammedi et al., 1999). 
The detailed consensus linkage map of the chicken genome pro¬ 
vides a large set of markers (n = 2,012; Groenen and Crooij- 
mans, 2003), approximately one every 2 cM, that can be used in 
QTL studies for either whole-genome or regional scans. Schmid 
et al. (2000) published a first rough outline of chicken-human 
and chicken-mouse comparative maps. The comparative map 
provides an efficient way to identify relevant genes in livestock,
based on the mapping information of species with more
detailed maps such as human and mouse. However, the resolu¬ 
tion of these comparative maps in chicken remains low be¬ 
cause of a high number of inter- and intrachromosomal rear¬ 
rangements between chicken and mammals (Crooijmans et al., 
2001). Recently, Morisson et al. (2002) have created Chick- 
RH6, a chicken whole-genome radiation hybrid panel that con¬ 
sists of 90 hybrid clones. This panel makes it possible to map 
markers by simple PCR, avoiding development of polymor¬ 
phic markers as is required for genetic mapping. Finally, a 
detailed physical map - based on overlapping large insert 
clones such as bacterial artificial chromosomes (BACs) - is 
necessary to facilitate whole-genome sequencing (Gregory et 
al., 2002). The physical map is also an important resource for 
fluorescent in situ hybridization (FISH) studies. Although 
chromosome-walking experiments have resulted in parts of the 
physical maps for chicken chromosomes 8, 10, 13, 15, 24 and 
28 (Crooijmans et al., 2001; Buitenhuis et al., 2002; Jennen et 
al., 2002), most chromosomal regions in chicken lack identi¬ 
fied BAC clones. 

Integration of the above resources is necessary to facilitate 
whole-genome sequencing of chicken, planned in the second 
and third quarter of 2003. The availability of the chicken DNA
sequence will not only boost research in birds, but will also aid 
in the further detailed annotation of the human genome 
sequence. Human-mouse sequence comparisons using for ex¬ 
ample PipMaker (Frazer et al. 2003) show high similarity not 
only for coding and regulatory regions, but also for large parts 
of the “junk”-DNA. In contrast, preliminary results show that 
the evolutionary distance between human and chicken offers a 
very good signal-to-noise ratio in large-scale sequence compari¬ 
sons for confirmation or negation of hypothetical genes, to 
highlight novel genes and for the identification of regulatory 
elements. 

The research presented in this paper lays the foundation for 
the integration of different genomic resources in chicken 
through the Wageningen BAC library. First, a BAC anchor map 
is created to link BACs to the genetic map. Second, the com¬ 
plete BAC library is fingerprinted by HindIII digestion to be
included in contig building to create the complete genome 
physical map. Third, using FISH mapping, cytogenetic posi¬ 
tions are assigned to specific BACs. Fourth, BAC end sequenc¬ 
ing is in progress to allow anchoring of shotgun-sequencing con- 
tigs to the physical map of chicken. 


Materials and methods 

BAC library 

This project used the Wageningen chicken BAC library, consisting of 
50,208 BAC clones (Crooijmans et al., 2000). The clones have a reported 
average insert size of 134 kb, representing a 5.6x coverage of the chicken
genome. 
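
The quoted 5.6x coverage can be reproduced from the clone count and average insert size. The sketch below also estimates the chance that a given locus is present in at least one clone, under the simplifying assumption that clones are sampled uniformly at random (a Poisson model introduced here for illustration). The genome size of about 1.2 × 10^9 bp is an assumed value, not one stated in this section, and the names are hypothetical.

```python
import math

N_CLONES = 50_208          # clones in the library, from the text
MEAN_INSERT_BP = 134_000   # reported average insert size
GENOME_BP = 1.2e9          # assumed haploid genome size


def library_coverage(n_clones, mean_insert_bp, genome_bp):
    """Total cloned DNA expressed in genome equivalents."""
    return n_clones * mean_insert_bp / genome_bp


def prob_locus_represented(coverage):
    """Chance that a locus lies in at least one clone (Poisson model)."""
    return 1.0 - math.exp(-coverage)


cov = library_coverage(N_CLONES, MEAN_INSERT_BP, GENOME_BP)
p_hit = prob_locus_represented(cov)

print(f"library coverage: {cov:.1f}x")
print(f"probability a locus is in the library: {p_hit:.3f}")
```

On this model a marker would be expected to hit between five and six clones on average, which gives a baseline against which observed numbers of positive BACs per marker can be compared.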

The BAC library is stored in 130 384-well plates. Row-, column- and 
platepools are created for each plate to enable PCR screening of the library. 

BAC anchor map 

The BAC library was screened with two sets of markers: STS (Sequence 
Tagged Site) markers located on the chicken linkage map and STSs created in 
chromosome walking experiments. 

The first set was based on the extensive chicken consensus linkage map as 
published by Groenen et al. (2000) and updated by Schmid et al. (2000). The 
total number of microsatellite markers on this map is 1,255. For 37 of these, 
at least one positive BAC was already identified in previous work (Crooij¬ 
mans et al., 2001; Buitenhuis et al., 2002; Jennen et al., 2002, in review). The 
remaining 1,218 microsatellite markers were used for an exhaustive screen¬ 
ing of the Wageningen BAC library by PCR. 

The second set of markers consisted of 853 STSs generated by BAC end 
sequencing in chromosome walking experiments. The genetic position of 
these markers is known indirectly, because walking experiments started with 
markers on the genetic map. 

A two-dimensional screening was performed, as described by Crooij¬ 
mans et al. (2000). In a first step, the plate pools were screened to identify the 
plates that are positive for the marker. In a second step, the row- and column- 
pools of these plates were screened to find the coordinates of the positive 
BAC clone. If this resulted in spurious or ambiguous results (i.e. weak signal 
or multiple positive rows/columns), the single BAC clone was tested. To 
obtain single BAC clone DNA, clones were grown overnight in LB medium 
with 12.5 mg/ml chloramphenicol at 37 °C. Then 5 µl of cell suspension was diluted
with 95 µl ddH2O. DNA was obtained by lysis at 95 °C for 10 min, centrifugation
at 1200 g for 3 min and discarding of the pellet.
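
The row/column step of this screening can be sketched as a small lookup: each positive row pool and column pool on a positive plate intersect at a candidate well, and more than one positive row or column leaves the coordinates ambiguous, which is the case where single clones are tested. The function name and the well-label convention (letter rows, numbered columns) are assumptions made for illustration.

```python
from itertools import product


def resolve_positive_wells(positive_rows, positive_cols):
    """Combine row-pool and column-pool results for one positive plate.

    Returns (candidate_wells, ambiguous). With exactly one positive row
    and one positive column the clone is located directly; otherwise
    every intersection is a candidate to be tested as a single clone.
    """
    candidates = [
        f"{row}{col}"
        for row, col in product(sorted(positive_rows), sorted(positive_cols))
    ]
    ambiguous = len(positive_rows) > 1 or len(positive_cols) > 1
    return candidates, ambiguous


# One positive clone on the plate: row C and column 7 light up.
print(resolve_positive_wells({"C"}, {7}))
# Two positive clones: four intersections, only two of which are real.
print(resolve_positive_wells({"C", "K"}, {7, 19}))
```

The appeal of pooling is the reduction in reactions: a marker is tested against one pool per plate plus the row and column pools of each positive plate, rather than against every well in the library.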

Standard PCR techniques were used to find positive BACs for each 
marker. PCR volumes were 10 µl and reactions contained 5 pg/µl DNA,
0.195 µM of each primer, 0.14 U/µl Taq (Silverstar, Eurogentec, Belgium),
1.071 mM TMACl (tetramethylammonium chloride), 0.186 mM dNTPs,
2.15% DMSO and 1x PCR buffer (1x PCR buffer contained 10 mM Tris-HCl
pH 9.0, 1.5 mM MgCl2·6H2O, 50 mM KCl, 0.01% (w/v) gelatin and


0.1% Triton X-100). PCR conditions were 95°C for 5 min, 35 cycles of 
95 °C for 30 s, 45 °C for 30 s and 72 °C for 30 s, followed by 72 °C for 4 min.
If necessary, the annealing temperature was increased to 50 °C or 55 °C to 
decrease nonspecific binding of the PCR primers. Products were separated
using standard agarose gel electrophoresis (1.5% multi-purpose agarose, 0.5 x 
TBE buffer, 45 min, 120 V). In case of fragments smaller than 100 bp, a 4%
NuSieve agarose gel was used instead of the standard agarose gel.

For BACs identified in chromosome walking, BAC ends were sequenced 
and new PCR primers were developed on these sequences as described by 
Crooijmans et al. (2001). 

BAC fingerprinting and contig building 

Preparation of DNA, restriction endonuclease digestion with HindIII,
agarose gel electrophoresis and data acquisition (using the Image program; 
http://www.sanger.ac.uk/Software/Image/) were adapted from Marra et al. 
(1997). DNA preparation was performed using polystyrene “Uni-Filter 800” 
receiver plates (Polyfiltronics). DNA was resolved in 20 µl of T5E0.1. Individual
digestion brews contained 0.5 µl ddH2O, 1 µl buffer “R+”, 0.5 µl HindIII
(40 U/µl) and 8 µl BAC DNA. The size standard marker for gel electrophoresis
consisted of 46.6 µl Orange G, 6.0 µl Fermentas marker II, 0.8 µl Boehringer
marker V and 223.2 µl TE buffer. Gels were scanned on a BioRad FX
scanner. 

Contig building based on the fingerprints was performed using the FPC 
program (http://www.sanger.co.uk/Software/fpc/v6; Soderlund et al., 2000). 
All marker and BAC data from the BAC anchor map were loaded into the 
FPC database. Based on preliminary experiments, tolerance was set to 4 and 
flagged as variable. FPC was run three times with cutoff values 10e-10, 10e-12
and 10e-14. The DQ-er was run for contigs with two or more Q-clones. Manual
editing was not performed.
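
To make the roles of the tolerance and cutoff parameters concrete, the sketch below counts bands shared by two fingerprints within a tolerance and computes a binomial-tail probability that unrelated clones would share at least that many bands by chance. This is a simplified illustration in the spirit of fingerprint coincidence scoring, not FPC's own code; the gel length of 3300, the band positions and all function names are hypothetical.

```python
from math import comb


def count_shared_bands(bands_a, bands_b, tolerance):
    """Greedy one-to-one matching of band positions within a tolerance."""
    a, b = sorted(bands_a), sorted(bands_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


def coincidence_probability(n_a, n_b, shared, tolerance, gel_length):
    """Binomial tail: chance of >= `shared` matches between unrelated clones."""
    n_low, n_high = sorted((n_a, n_b))
    # Chance that one band of the smaller clone matches any band of the larger.
    p_band = 1.0 - (1.0 - 2.0 * tolerance / gel_length) ** n_high
    return sum(
        comb(n_low, m) * p_band**m * (1.0 - p_band) ** (n_low - m)
        for m in range(shared, n_low + 1)
    )


GEL_LENGTH = 3300  # hypothetical number of positions on the gel image
shared = count_shared_bands([100, 200, 300], [102, 250, 299], tolerance=4)
strong = coincidence_probability(20, 20, 18, tolerance=4, gel_length=GEL_LENGTH)
weak = coincidence_probability(20, 20, 5, tolerance=4, gel_length=GEL_LENGTH)
print(shared, strong, weak)
```

Two clones would be placed in the same contig only when this probability falls below the chosen cutoff, so a more stringent cutoff yields fewer false joins at the cost of more, smaller contigs.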

Perl objects were developed to facilitate descriptive analysis of the FPC 
results and comparison of the outcomes using the three different cutoff val¬ 
ues. 

Full-length BAC sequences for length comparison in Table 2 were down¬ 
loaded from the Comparative Vertebrate Sequencing website from the NIH 
Intramural Sequencing Centre website (http://www.nisc.nih.gov) on April 
29, 2003.

Fluorescent in situ hybridization 

Metaphase spreads were obtained from 9-day old embryo fibroblast cultures,
synchronized with 0.06 µg/ml colcemid (Gibco BRL) and fixed by
standard procedures. 

Two-colour FISH for six labeled probes (see Table 4) was performed 
according to Morisson et al. (1998). 


Results 

BAC anchor map 

Genetic positions of BACs were determined by PCR screening of the BAC
library for two sets of DNA markers. The first set consisted of
microsatellite markers with known location on the consensus linkage map.
The second set consisted of sequence tagged sites (STSs) generated by BAC
end sequencing in chromosome walking experiments (Crooijmans et al. 2001;
Buitenhuis et al. 2002; Jennen et al. 2002, in review). The genetic position
of set 2 markers is known indirectly, because the walking experiments were
initiated at markers mapped on the genetic map.

Figure 1 provides an overview of the genetic positions of the 
BAC anchors; i.e. markers that directly link one or more BACs 
to the linkage map. 

BAC anchors could be identified for 34 linkage groups. These linkage groups
represent all of the macrochromosomes (GGA1-GGA8 and GGAZ), several
microchromosomes (GGA9-GGA20, GGA23, GGA24 and GGA26-GGA28) and ten smaller
linkage groups. The exact number of microchromosomes covered is not yet
known because some of the microchromosomes might be represented by more than
one linkage group.

Fig. 1. Overview of the genetic positions of the BAC anchors for the first
set (i.e. markers with a known location on the genetic map; triangles). Dots
represent markers for which the cytogenetic position has been assessed. The
ticks on the axis for each chromosome/linkage group represent 10 cM-steps.

The anchoring results are summarized in Table 1. In total, 1,522 markers
identified 2,983 distinct BACs. The average number of positive BACs per
marker was 3.8. For set 1, we were unable to identify a BAC for 205 markers.
Eighty-two percent of these could be attributed to markers that were not
optimized for PCR screening (e.g. negative genomic control or amplification
of nonspecific bands; data not shown).

Table 1. General overview of mapping results. A distinction is made between
markers from set 1 and set 2 (i.e. markers from the genetic map and markers
from BAC end sequencing, respectively).

         No. of markers   No. of marker-BAC pairs   No. of distinct BACs
set 1          652                 1,791                    1,569
set 2          870                 3,960                    1,857
total        1,522                 5,751                    2,983


Table 2. Comparison of BAC lengths, as calculated by fingerprinting and
sequencing

BAC           Length by             Length by            Difference (bp)
              fingerprinting (bp)   sequencing (bp)
WAG-38H9            71,996                89,724             -17,728
WAG-55C14           87,671                55,589              32,082
WAG-65N20           94,593               109,569             -14,976
WAG-68G2           131,268               110,417              20,851
WAG-69H2           116,545               135,273             -18,728
WAG-71G10          104,076               120,464             -16,388
WAG-77D19          108,195               144,092             -35,897
WAG-93J15          115,703               136,822             -21,119
WAG-100N11         129,663               144,369             -14,706
WAG-105M15          81,541                98,793             -17,252
WAG-126P17         103,956               129,285             -25,329


Table 3. Summarized results of contig building using FPC for three different
stringencies (with 10e-14 most stringent). Anchored contigs are contigs that
contain at least one BAC from the BAC anchor map.

Cutoff                            10e-10    10e-12    10e-14
No. of contigs
  anchored                           609       597       521
  not anchored                     4,598     6,012     6,457
No. of contigs with > 10 BACs
  anchored                           286       145        68
  not anchored                       845       639       334
No. of BACs
  in anchored contigs              9,256     4,968     2,991
  in non-anchored contigs         32,615    31,357    26,387
  as singletons                    7,419    12,965    19,912


A comprehensive list of marker-BAC pairs can be found as 
supplemental material at the journal’s website (Supplemental 
Table 1, www.karger.com/doi/10.1159/000075766). 

BAC fingerprinting and contig building 

HindIII digestion of all 50,208 BACs of the Wageningen BAC library resulted
in 49,290 high-quality fingerprints (98.2%). The average number of bands per
BAC was 21 ± 5; average size of a band was 4241 ± 2873 bp. Consequently, the
average BAC size was 89 ± 20 kb. The BAC sizes for 11 BACs were compared to
BAC full-length sequences that are already available. On average, the BAC
length based on the sequence was 12 ± 20 kb larger than the length as
calculated from the BAC fingerprints. Length data are presented in
Table 2.
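
As an illustration of how the reported 12 ± 20 kb difference follows from Table 2, the sketch below recomputes the mean and spread of the per-BAC differences. The lengths are copied from Table 2; the use of the sample standard deviation (n - 1) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

# (length by fingerprinting, length by sequencing) in bp, copied from Table 2.
TABLE_2 = {
    "WAG-38H9": (71_996, 89_724),
    "WAG-55C14": (87_671, 55_589),
    "WAG-65N20": (94_593, 109_569),
    "WAG-68G2": (131_268, 110_417),
    "WAG-69H2": (116_545, 135_273),
    "WAG-71G10": (104_076, 120_464),
    "WAG-77D19": (108_195, 144_092),
    "WAG-93J15": (115_703, 136_822),
    "WAG-100N11": (129_663, 144_369),
    "WAG-105M15": (81_541, 98_793),
    "WAG-126P17": (103_956, 129_285),
}


def length_differences(table):
    """Return fingerprint-minus-sequence differences in bp, as in Table 2."""
    return [fingerprint - sequence for fingerprint, sequence in table.values()]


diffs = length_differences(TABLE_2)
mean_kb = mean(diffs) / 1000
sd_kb = stdev(diffs) / 1000  # assumption: sample standard deviation (n - 1)
print(f"mean difference: {mean_kb:.1f} kb, SD: {sd_kb:.1f} kb")
```

A negative mean here corresponds to the sequenced length being the larger of the two, which is how the text phrases the result.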

The FPC program was used to build contigs based on these 
fingerprints. Manual editing was not performed. Results of the 
automated contig assembly are summarized in Table 3. 

A full list of BACs and their assigned contigs is available at the journal’s
website (Supplemental Table 2). For each BAC, this list gives the contig
names with FPC cutoff values 10e-10, 10e-12 and 10e-14. The less stringent
cutoff value 10e-10 results in large contigs with a higher chance of
misassembly. These large contigs can be split with higher cutoff stringency.

Using an in-house developed Perl script, the accuracy of each automatically
assembled contig could be visualized. Figure 2 shows an overview of a contig
of the 10e-10 FPC assembly. For each BAC on the x-axis, the y-axis shows the
most stringent threshold (of the FPC cutoff values 10e-10, 10e-12 and
10e-14) at which that BAC is likely to overlap with the BAC at its left.
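
The quantity plotted on the y-axis can be sketched as follows. This is not the authors' Perl script; it is a minimal Python illustration under two assumptions: that the cutoffs written as 10e-10, 10e-12 and 10e-14 are read as the powers of ten 10^-10, 10^-12 and 10^-14, and that each BAC has a single overlap score with its left neighbour for which a lower value means a more convincing overlap. The function name and the example scores are hypothetical.

```python
# Assumption: the three cutoffs are read as powers of ten.
# Level 1 = 10^-10, level 2 = 10^-12, level 3 = 10^-14 (most stringent).
CUTOFFS = (1e-10, 1e-12, 1e-14)


def stringency_level(score, cutoffs=CUTOFFS):
    """Return the most stringent level (1-3) that the overlap score passes.

    `score` is a hypothetical overlap score between a BAC and its left
    neighbour; a lower score means a more convincing overlap. Returns 0
    when the score fails even the least stringent cutoff.
    """
    level = 0
    for index, cutoff in enumerate(cutoffs, start=1):
        if score <= cutoff:
            level = index
    return level


# Hypothetical scores for four neighbouring BAC pairs.
scores = [3e-16, 5e-11, 2e-13, 1e-8]
levels = [stringency_level(s) for s in scores]
print(levels)
```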

Fluorescent in situ hybridization 

The BAC anchor map serves as a starting point for cytogen¬ 
etic mapping. Fillon et al. (in preparation) used these BACs to 
map genetic positions to cytogenetic positions. In Fig. 1, these 
BACs are indicated by dots. A complete list is provided as sup¬ 
plemental material. 

The cytogenetic position of six microsatellite markers on the 
genetic map of chromosome 4 was verified using positive BACs 
as probes for two-colour “caterpillar” FISH, i.e. WAG-112C24, 
WAG-125P16, WAG-118M14, WAG-33G16, WAG-12C6 and 
WAG-37E19 (see Fig. 3). All six probes could be clearly identi¬ 
fied on the chicken chromosome spread and were located in the 
same order as on the genetic map. 


Fig. 2. Accuracy of a contig of the 10e-10 cutoff FPC assembly. For each BAC
clone on the x-axis, the highest FPC cutoff stringency is given at which
that clone is calculated to overlap with the clone at its left (1 = 10e-10,
2 = 10e-12, 3 = 10e-14). This figure does not reflect actual clone order
within the contig.

Fig. 3. Assessment of cytogenetic position of six microsatellite markers on
chicken chromosome 4. Probes used were BACs positive for the markers shown
in the upper right corner of the figure.



Table 4. BACs used as probes for two-colour FISH on GGA chromosome 4. For
chromosomal position and genetic marker, the BAC that is used as a probe is
specified.

Position (cM)   Marker     BAC
     12         ADL0317    WAG-112C24
     75         MCW0295    WAG-125P16
    112         ADL0246    WAG-118M14
    128         ROS0024    WAG-33G16
    207         MCW0180    WAG-12C6
    243         LEI0073    WAG-37E19


Discussion 

BAC anchor map 

Figure 1 shows that the BAC anchor map provides a broad 
coverage of the chicken genome. The BAC anchor map alone 
already covers about 8% of the chicken whole-genome physical
map; markers of set 1 can account for 5.7%. On average, a BAC
anchor is identified every 6.5 cM. The largest gap in the BAC 
anchor map is 58 cM and located on the q-arm of chromosome 
2. The marker density of the genetic map of some of the 
microchromosomes is too low to enable building BAC anchor 
maps for these chromosomes (e.g. E57, E58, WAU31 and 
WAU32). 

Multiple reasons exist for the fact that several markers could not identify
a positive BAC. First, coverage of 5.6 means that approximately 99.2% of the
genome is represented (Crooijmans et al., 2000). Therefore, markers in the
remaining 0.8% will not identify any clones. Second, the markers were
developed for fluorescent genotyping of large populations for linkage and
QTL studies. Using fluorescent dyes, a small quantity of amplified DNA is
sufficient to be detected. In several cases, the quantity of the complete
amplification product of a marker is still too low to be visible in an
agarose gel screening system. In addition, many markers amplified too many
nonspecific bands, which made detection of the right PCR band impossible, or
amplified products smaller than 100 bp. These small fragments are difficult
to call on a standard 1.5% agarose gel because of interference with
primer-dimer bands. In case of spurious (e.g. many nonspecific bands) or
ambiguous results, the BAC was considered negative.

The average number of BACs per marker in set 2 is higher 
than in set 1, because markers of set 2 were developed specifi¬ 
cally for testing on agarose gels and not for fluorescent genotyp¬ 
ing. Furthermore, spurious results after agarose gel interpreta¬ 
tion for set 1 were discarded by definition, while those for set 2 
could often be resolved by incorporating other data from chro¬ 
mosome walking experiments. 
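
The difference between the two marker sets can be made concrete with the counts from Table 1. The sketch below simply divides marker-BAC pairs by markers for each set; the per-set values are derived here and are not reported in the text, whereas the overall value corresponds to the 3.8 given in the Results.

```python
# (no. of markers, no. of marker-BAC pairs), copied from Table 1.
TABLE_1 = {
    "set 1": (652, 1_791),
    "set 2": (870, 3_960),
}


def bacs_per_marker(markers, pairs):
    """Average number of positive BACs per marker."""
    return pairs / markers


per_set = {name: bacs_per_marker(*counts) for name, counts in TABLE_1.items()}

total_markers = sum(markers for markers, _ in TABLE_1.values())
total_pairs = sum(pairs for _, pairs in TABLE_1.values())
overall = bacs_per_marker(total_markers, total_pairs)

for name, value in per_set.items():
    print(f"{name}: {value:.1f} BACs per marker")
print(f"overall: {overall:.1f} BACs per marker")
```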

The BAC anchor map allows for integration and quality 
control of the genetic and physical maps. In total, 184 of the 
1,569 distinct BACs of set 1 in Table 1 were identified by more 
than one marker. Using this information, it is possible to assess 
the quality of the consensus genetic map. We identified six 
locations on the genetic map where markers that are several cM 
apart map to the same clone (i.e. WAG-11C21, WAG-32A2, 
WAG-41H14, WAG-54F10, WAG-83J16 and WAG-90M16). 
The BAC WAG-90M16, for example, is positive for markers ADL0105, MCW0271,
ROS0075 and MCW0351, even though markers ROS0075 and MCW0351 are 9 cM apart
(94 cM and 105 cM on GGA8, respectively). This can be caused either by a
recombination hotspot between 94 cM and 105 cM on this chromosome, or by an
error in the genetic map. Map errors can arise in two ways. First,
genotyping errors result in erroneous location of a genetic marker. Second,
some markers have a low number of informative meioses or do not segregate in
all 12 families used to construct the chicken consensus linkage map.

BAC fingerprinting and contig building 

Based on the fingerprinting results, we found an average 
BAC insert size of 89 kb, which is smaller than the estimated 
134 kb as published by Crooijmans et al. (2000) by pulsed-field 
gel electrophoresis (PFGE). Several reasons exist for this differ¬ 
ence. First, the size as calculated by fingerprinting is an under¬ 
estimation. Small digestion products are difficult to identify in 
the Image program and were not called. Second, it is often diffi¬ 
cult to detect co-migration of two or more digestion fragments. 
A third important reason is the inherently rough nature of frag¬ 
ment size estimation by PFGE. This can be attributed to the 
large difference in concentration between BAC and marker 
DNA, the fact that each band appears as a smear and, most 
important, the non-linear dependency of migration speed on 
fragment size. 

To further investigate the difference in average BAC size, we compared BACs
that have already been sequenced with our fingerprinting data. A three-way
comparison between sequence, fingerprints and PFGE data for the same BACs
was not possible, because the PFGE length assignments by Crooijmans et al.
(2000) were performed on anonymous BACs before picking the BAC clones
eventually making up the BAC library. On average, the sequenced lengths were
larger by a factor of 1.11 than the lengths as calculated after
fingerprinting. This confirms the systematic underestimation also found by
Le Hellard et al. (2001). However, for clones WAG-55C14 and WAG-68G2, the
sizes as calculated based on restriction digestion fragment sizes are larger
than the sizes as calculated by sequencing. For WAG-68G2, the fingerprint
pattern is questionable, as it does not show the characteristic decrease in
intensity with smaller fragment size. Interestingly, for WAG-55C14, in
silico HindIII digestion of the nucleotide sequence showed that fragment
sizes were all smaller by a factor of 1.10 to 1.25 compared to the fragments
read from the agarose gel. Re-evaluation of the gel pattern of this clone
and its adjacent markers confirmed consistent gel migration and correct band
calling.
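
An in silico HindIII digestion of the kind mentioned above can be sketched in a few lines. The code below is an illustration only, not the procedure the authors used: it treats the sequence as linear (a BAC clone is circular and includes vector sequence, which is ignored here), and the toy sequence is hypothetical. HindIII recognizes AAGCTT and cuts after the first A; because the site is palindromic, scanning one strand is sufficient.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence
CUT_OFFSET = 1           # HindIII cuts after the first A (A^AGCTT)


def digest_linear(sequence, site=HINDIII_SITE, cut_offset=CUT_OFFSET):
    """Return fragment lengths (bp) after cutting a linear sequence at every site."""
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy sequence (hypothetical, not a real BAC) containing two HindIII sites.
toy = "GGGG" + "AAGCTT" + "CCCCCCCC" + "AAGCTT" + "TT"
fragments = digest_linear(toy)
print(fragments)
```

Comparing such predicted fragment lengths with the band sizes called from the gel is the kind of check described for WAG-55C14.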

Based on the fingerprints and the average underestimation by a factor of 1.1
compared to the sequenced clones, our estimate for average insert size of
the Wageningen BAC library is 100 kb.
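
The arithmetic behind this estimate can be retraced from Table 2. The sketch below takes the ratio of summed sequenced lengths to summed fingerprint lengths; this is one way to arrive at a factor close to the reported 1.11, and it is an assumption that the authors computed it this way. The factor is then applied to the 89 kb fingerprint-based average.

```python
# Lengths in bp for the 11 BACs of Table 2, in table order.
fingerprint_lengths = [71_996, 87_671, 94_593, 131_268, 116_545, 104_076,
                       108_195, 115_703, 129_663, 81_541, 103_956]
sequence_lengths = [89_724, 55_589, 109_569, 110_417, 135_273, 120_464,
                    144_092, 136_822, 144_369, 98_793, 129_285]

# Assumption: correction factor taken as the ratio of summed lengths.
factor = sum(sequence_lengths) / sum(fingerprint_lengths)

average_fingerprint_size_kb = 89  # fingerprint-based average from the Results
corrected_size_kb = average_fingerprint_size_kb * factor

print(f"correction factor: {factor:.2f}")
print(f"corrected average insert size: {corrected_size_kb:.0f} kb")
```

The corrected value comes out just under 100 kb, in line with the rounded estimate given in the text.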

As the coverage of the BAC library is too low to create large contigs,
manual editing of the automated contig assembly by FPC was not performed. To
enable contig assembly by FPC, either the resolution of the BAC library has
to be increased by using additional enzymes (Tao et al., 2001), or the
genome coverage (i.e. the number of BACs that are fingerprinted) has to be
increased. Therefore, our fingerprints are being merged with fingerprint
data, generated at Washington University, of other chicken BAC libraries
derived from Red Jungle Fowl. The library collection consists of the
Michigan State University TAM31 BamHI, TAM32 EcoRI and TAM33 HindIII
libraries and the CH261 EcoRI library (Children’s Hospital Oakland Research
Institute). Contig building and manual editing of the combined
fingerprinting data is currently in progress at Washington University.
Combining chromosome-walking data with the fingerprint contigs acts as a
quality check for the Wageningen fingerprints. Preliminary results show that
clones that overlap based on chromosome walking experiments also overlap
based on fingerprints.

The fingerprint data and contigs built by FPC on the Wageningen data speed
up ongoing chromosome walking in two ways. First, because fingerprint
fragment sizes are known for each BAC, the largest BAC can be selected for
BAC end sequencing from a list of overlapping BACs. Second, finding a
positive BAC for a BAC-end marker generally involves three steps:
identifying a positive plate pool, identifying the positive row- and
column-pools, and checking the individual BAC clone. Using results from FPC,
and representations as in Fig. 2, it is often possible to skip the second
step and check the individual BAC directly (data not shown). Marra et al.
(1999) showed that the FPC program does a good job in putting overlapping
BACs in the same contig, but their order within the contig has to be
corrected manually. Therefore, as no manual editing was performed, Fig. 2
reflects the possible relationships between clones, but not the clone order
within the contig. The example in Fig. 2 clearly shows that at a stringency
level of 10e-10 the 44 BACs in the figure are within a single contig.
Increasing the stringency to 10e-12 results in breaking up the contig into
six smaller contigs and five unlinked BACs. When stringency is increased
even more to 10e-14, 31 BACs remain grouped in four contigs and 13 BACs
become singletons. A program (named coral) is being developed at the
University of Vancouver to recalculate the actual clone order. This program
was not yet available at the time of our analysis.
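
How a contig falls apart as the cutoff becomes more stringent can be illustrated with a small grouping routine. This is a simplified sketch and not FPC's algorithm: clones are linked whenever their pairwise score passes the cutoff, and contigs are the resulting connected groups. The clone names, the scores, and the reading of the cutoffs as powers of ten are all hypothetical.

```python
def build_contigs(clones, pair_scores, cutoff):
    """Group clones into contigs; two clones are linked if score <= cutoff.

    Returns (contigs, singletons), where contigs are sorted lists of two or
    more clones and singletons are clones with no passing overlap.
    """
    parent = {clone: clone for clone in clones}

    def find(clone):
        # Follow parent links to the group representative (path halving).
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    for (a, b), score in pair_scores.items():
        if score <= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for clone in clones:
        groups.setdefault(find(clone), []).append(clone)

    contigs = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    singletons = sorted(g[0] for g in groups.values() if len(g) == 1)
    return contigs, singletons


# Hypothetical clones and pairwise overlap scores (lower = stronger overlap).
clones = ["A", "B", "C", "D", "E"]
pair_scores = {
    ("A", "B"): 1e-15,
    ("B", "C"): 1e-11,
    ("C", "D"): 1e-13,
    ("D", "E"): 1e-11,
}

for cutoff in (1e-10, 1e-12, 1e-14):
    contigs, singletons = build_contigs(clones, pair_scores, cutoff)
    print(cutoff, contigs, singletons)
```

With these toy scores the single contig at the loosest cutoff splits into two contigs plus one singleton, and then into one contig plus three singletons, mirroring the pattern described for Fig. 2. Like Fig. 2, the grouping says nothing about clone order within a contig.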

Fluorescent in situ hybridization 

The BAC resource described in this paper has already prov¬ 
en to be of high importance for the integration of the linkage 
and cytogenetic maps (Schmid et al., 2000; Fillon et al. 2003). 
As a further illustration of the strength and possible applica¬ 
tions for this BAC resource, we performed a “caterpillar” FISH 
as shown in Fig. 3. The FISH experiment shows clearly that the
data presented in this paper are an interesting resource for FISH
mapping. Using this technology, both chromosomal location
and orientation of the integrated genetic, physical and cytogen¬ 
etic maps can be identified. By comparing the position of the 
hybridization signal of probes at the end of the genetic map, 
relative to the telomeric ends, the distance between the ends of 
the genetic map and the physical map can be assessed. Chromo¬ 
some numbers and rearrangements can be accurately defined. 
Furthermore, although most chicken genetic markers cannot be used in other
bird species, the chicken BAC clones are excellent probes in cross-species
hybridization in other birds, for example FISH in golden pheasant, quail and
turkey (Schmid et al., 2000).

In conclusion, our research allows for integration of multiple genomic
resources in chicken, i.e. genetic, physical, cytogenetic and sequence maps.
This integration is a prerequisite for whole-genome sequencing, which is
currently in progress. The BAC anchor map and HindIII digestion fingerprints
allow for chromosomal positioning of clone contigs in the construction of
the whole-genome physical map. FISH mapping using BACs of the BAC anchor map
integrates the physical and cytogenetic maps. The BAC-end sequences will
align shotgun-sequencing contigs to the physical map. These integrated
resources will be valuable tools in genomic research before and after
publication of the full chicken DNA sequence.

The availability of the chicken DNA sequence will not only boost research in
birds, but will also aid in the further annotation of the genomes of other
species, in particular those of man and mouse. The easy access of the
chicken embryo in combination with the availability of a full set of
molecular resources (ESTs, BACs, genome sequence) and a way of switching
easily between different maps (i.e. the BAC anchor map) will also boost the
use of chicken as a model species in developmental biology.

Abstract. Marek’s disease virus (MDV) is a naturally occurring oncogenic
avian herpesvirus that causes neurological disorders and T cell lymphoma
disease in domestic chickens. Identification and functional characterization
of the individual factors involved in Marek’s disease (MD) resistance or
pathogenesis will enhance our understanding of MDV pathogenesis and further
genetic improvement of chickens. To study the genetic basis for resistance
to MD, a strategy that combined protein-protein interaction screens with
subsequent linkage analysis was used. The MDV protein US10 was used as the
bait in an E. coli two-hybrid screening of a cDNA library derived from
activated splenic T cells. The chicken LY6E, also known as SCA2 and TSA1,
was found to specifically interact with US10. This interaction was confirmed
by an in vitro protein-binding assay. Furthermore, LY6E was found to be
significantly associated with MD traits in an MD resource population
comprised of commercial chickens. Previously, LY6E was implicated in two
independent DNA microarray experiments evaluating differential gene
expression following MDV infection. Given that LY6E is involved in T cell
differentiation and activation, we suggest that LY6E is a candidate gene for
MD resistance and deserves further investigation on its role in MDV
pathogenesis, especially with respect to the binding of US10.

Copyright © 2003 S. Karger AG, Basel


Marek’s disease (MD) is a lymphoproliferative disease of chickens induced by
the Marek’s disease virus (MDV), a highly cell-associated α-herpesvirus
(Churchill and Biggs, 1967; Nazerian et al., 1968). Susceptible chickens
infected with MDV develop lesions or tumors in nerve and visceral tissues,
which leads to paralysis, blindness, and eventually death (Calnek and
Witter, 1997). Due to its ubiquitous presence in the environment, virtually
every chick is exposed at hatch or shortly thereafter. MDV enters an early
cytolytic replication phase three to five days post-infection where the
virus is observed mostly in the B-lymphocytes along with a few activated
T-lymphocytes (Shek et al., 1983; Calnek et al., 1984). At days six to seven
post-infection, the virus switches to a latent phase, a characteristic of
all herpesviruses (Shek et al., 1983). This switch is generally associated
with the development of an immune response in the host. Progression to a
second cytolytic infection has been observed only in genetically susceptible
birds at two to three weeks following infection. Lymphoproliferation and the
development of T-cell tumors represent the final stage of MD pathogenesis
(Buscaglia and Calnek, 1988; Calnek and Witter, 1997).

MD is a major chronic disease problem and poses a tremendous threat to the
poultry industry. The annual losses worldwide from the costs of vaccination
and MD-associated mortality, meat condemnation, and reduction of egg
production are estimated to approach $1 billion (Purchase, 1985). Since
1970, vaccines have been available that greatly limit MD incidence in the
field (Witter, 1997). However, MD vaccines control rather than eliminate
losses from MD since they do not block MDV infection and replication.
Moreover, the increasing frequency of highly virulent strains of MDV that
produce disease outbreaks may severely erode the efficacy of the vaccine
(Witter, 1997). Hence, the development of genetically resistant birds, which
are known to exist and can be selected for though not easily, is currently
viewed as an especially promising approach for augmenting control of MD.

Genetic resistance to MD is complex, controlled by multiple genes of varying
effect size and influenced by environmental factors (Bacon et al., 2001). By
utilizing saturated molecular genetic maps coupled with a pedigreed resource
population, genomic regions known as quantitative trait loci (QTL), which
contain one or more genes responsible for complex traits, can be identified.
Substantial progress has been made towards identifying numerous QTL that
affect MD susceptibility through the use of segregation analysis in
intercross progeny between resistant and susceptible inbred chicken lines
(Bumstead, 1998; Vallejo et al., 1998; Yonash et al., 1999). Unfortunately,
primarily due to the small amount of phenotypic variation each QTL explains,
these QTL are resolved to large marker intervals that contain hundreds of
genes. Furthermore, as MD traits are highly variable even in inbred lines
(Burgess et al., 2001), identity-by-descent and other methods currently used
in livestock species are probably not feasible for greatly narrowing QTL
intervals. Thus, alternative approaches for the efficient fine mapping of MD
resistance QTL are required to identify positional candidate genes.

Recently, we described such an alternative strategy for identifying
candidate genes associated with MD resistance (Liu et al., 2001b; Cheng,
2003). In this approach, a two-hybrid screen that detects MDV-chicken
protein-protein interactions is integrated with linkage analysis. This
strategy entails screening for chicken genes that encode proteins that
directly interact with MDV proteins, and then testing to determine if the
identified genes either lie in previously mapped QTL regions or are
associated with MD resistance. The hypothesis underlying this strategy is
that, as a virus, MDV must use the host cell machinery for proliferation.
Host-virus interactions, specifically at the protein level, may be directly
involved in controlling the entry, replication, and spread of the virus, the
host immune response, and apoptosis or transformation of infected cells. The
identification of host proteins that interact with viral proteins would in
itself provide potential insight into MD pathogenesis. Indeed, host proteins
that are identified as interacting with specific viral proteins could well
be the products of disease resistance genes, our ultimate targets. In light
of this, we combined protein-protein interaction and linkage analyses to
demonstrate that the chicken growth hormone (GH) gene is an MD resistance
gene (Liu et al., 2001b). This was the first and, until now, the only report
on the identification of a candidate host gene associated with MD
resistance.

With the successful identification of GH as an MD resistance gene, we
decided to extend the number of MDV genes examined. In this study, MDV US10
was chosen as the bait in a bacterial two-hybrid screen of a cDNA library
constructed from splenic T cells. US10 is a tegument-associated protein and
is encoded by an open reading frame coding for 213 amino acids (Brunovskis
and Velicer, 1995). US10 is dispensable for viral growth both in vitro and
in vivo (Parcells et al., 1994). However, an interesting phenomenon of MDV
recombinant strain RM1, which contains a REV LTR insertion upstream of the
SORF2 gene, brought US10 to our attention (Jones et al., 1996). The
perturbed gene expression caused by the LTR promoter was correlated with the
loss of MDV oncogenicity in RM1, and a specific transcript driven by the LTR
was identified in the RM1 clone. Besides possessing SORF2, this particular
transcript contained US10 (Jones et al., 1996). This finding prompted our
further investigation of US10 in MD pathogenesis.

We report that chicken lymphocyte antigen 6 complex, locus E (LY6E, also
named stem cell antigen 2 or SCA2, and thymic shared antigen 1 or TSA1)
specifically interacts with US10. Furthermore, LY6E was found to be
significantly associated (P < 0.01) with length of survival and the
incidence of proventricular tumors following MDV challenge in a commercial
layer resource population. Moreover, the LY6E gene was implicated in two
previous DNA microarray experiments following MDV challenge between MD
resistant and susceptible chicken lines (Liu et al., 2001a) or in cultured
chicken embryo fibroblasts (Morgan et al., 2001). We conclude that LY6E is
likely to be an MD resistance gene based on our current experimental study,
prior experiments on differential gene expression, and the function of its
protein.


Materials and methods 

E. coli two-hybrid screen 

The screen for MDV-chicken protein-protein interactions was performed with
the BacterioMatch Two-Hybrid System (Stratagene, La Jolla, CA) according to
the instructions provided by the manufacturer. Briefly, MDV US10 cloned into
the pBT vector (pBT-US10) was used as bait to screen a chicken spleen cDNA
library. US10 was PCR amplified and inserted in-frame into the pBT vector
downstream of the binding domain between the EcoRI and XhoI restriction
endonuclease sites. A chicken cDNA library derived from activated splenic T
cells was fused with the pTRG target vector. The pBT-US10 plasmid and the
cDNA library were co-transformed into competent E. coli cells and positive
clones selected for carbenicillin resistance and β-galactosidase (β-gal)
activity. The resulting transformants were confirmed by selection on new
plates. Plasmids from positive colonies were isolated and subjected to DNA
sequencing by dye-terminator fluorescence sequencing on an ABI 3100
automatic DNA sequencer (Applied Biosystems, Foster City, CA) using the pTRG
sequencing primer described in the screening kit. Sequence data were queried
against the public databases using the BLAST program.

In vitro protein binding assay 

To confirm the interaction detected from the bacterial two-hybrid system, a
glutathione S-transferase (GST) fusion protein with LY6E was generated by
amplifying the recovered LY6E cDNA and cloning in frame into the pGEX-5X
vector (Amersham Biosciences, Piscataway, NJ). The GST-LY6E fusion protein
was adsorbed on glutathione Sepharose 4B beads (Amersham Biosciences,
Piscataway, NJ) and then used for the in vitro binding assay. The US10
sequence was amplified and cloned in frame into the pET28a expression
plasmid, and protein synthesized in vitro using the Single Tube Protein
System 3 (STP3) (Novagen, Madison, WI). One µg pET-US10 plasmid was used as
a template for a coupled T7-directed in vitro transcription-translation
reaction in the presence of 40 µCi [³⁵S]-methionine. 10 µl of the reaction
was incubated with Sepharose bead-adsorbed GST fusion protein or GST alone
for in vitro protein binding assay. Following serial washing with 1% Triton
X-100 in PBS, bound protein was eluted with 30 µl elution buffer (10 mM
reduced glutathione in 50 mM Tris-HCl, pH 8.0). All the samples were
subjected to 10% SDS-PAGE analysis followed by autoradiography.

LY6E genotyping and linkage analysis 

The commercial resource population used in this study and the MD traits
measured were previously described (Liu et al., 2001b). Briefly, the
resource population was a backcross (i.e., [Line 1 × Line 2] × Line 1) of
two commercial Leghorn pure lines differing in relative resistance or
susceptibility to MDV. Since each line contains a different LY6E allele, the
backcross progeny were either heterozygous or homozygous for the Line 1
(susceptible) allele. Detection of single nucleotide polymorphisms (SNPs)
for LY6E between the parental lines and genotyping of the progeny were
performed as previously described in Liu and Cheng (2003). Associations
between LY6E genotypes and the MD traits in 256 phenotypically selected
progeny were analyzed by ANOVA for continuous traits (e.g., survival) and χ²
for nonparametric traits (e.g., nerve lesions) using JMP (SAS Institute,
Cary, NC).

Cytogenet Genome Res 102:304-308 (2003)

Fig. 1. US10 and LY6E interact in vitro. S (supernatant) shows the input of
³⁵S-labeled US10 to the affinity columns. After one and seven washes (W1 and
W7, respectively), US10 is retained and eluted (E) with the GST-LY6E fusion
protein but not with the GST only control. [Gel lanes, left to right: GST
only (S, W1, W7, E); GST-LY6E (S, W1, W7, E).]

Fig. 2. Association analysis of LY6E genotype and length of survival (days).
Backcross progeny of a commercial resource population were challenged with
500 pfu MDV (strain 638) at one week of age, reared in an environmentally
controlled house, and observed until moribund or 20 weeks of age. The MHC
genotype was determined by serology for the B blood group. All chicks were
necropsied and scored for nerve enlargement and gross tumors in visceral
tissues. Statistical analyses between LY6E genotype and MD traits were
performed by ANOVA for continuous traits and χ² for nonparametric traits.
The diamond in each LY6E genotype identifies the 95% confidence interval for
the mean. Based on this analysis, LY6E is associated with length of survival
(P < 0.01).

Table 1. Effect of LY6E genotype on MD-associated traits in an MD resource
population derived from commercial layers

Trait                                          LY6E genotype
                                               Homozygous   Heterozygous   P value
Number of chicks                                  120           136
Mean survival (days)                               67            78         0.01
% with vagus lesions                               63            61         0.70
% with sciatic lesions                             18            22         0.45
% with heart tumors                                41            37         0.50
% with gonad tumors                                 9             9         0.92
% with proventriculus tumors                       19             9         0.01
Mean number of tissues with tumors per chick      1.7           1.6         0.37


Results 

In order to identify functional polypeptides that interact with the MDV US10
protein, an E. coli two-hybrid assay was employed using a chicken cDNA
library derived from splenic T cells. In total, ~1.2 × 10⁶ E. coli
transformants were screened on selection plates. Ten clones were found to be
positive after two rounds of selection for carbenicillin resistance and
β-gal activity. A database search and sequence alignment identified one prey
plasmid containing an insert of 378 nucleotides with 100% identity to the
entire coding sequence (126 amino acids) of chicken LY6E (GenBank accession
no. L34554). The other inserts did not exhibit significant homology to any
other known proteins in the current database.

Biochemical binding assays using purified proteins were performed to confirm
the interaction between LY6E and US10 that was detected in the two-hybrid
screen. The GST-fusion protein, with or without the LY6E, was incubated with
in vitro translated US10. After extensive washing, US10 was retained by the
GST-LY6E protein, but not by the GST alone (Fig. 1). This result suggests
that US10 is sufficient to specifically interact with LY6E.

Previously, the East Lansing mapping population was used 
to map LY6E near to the distal end of chromosome 2 (Liu and 
Cheng, 2003). This location does not lie near any of our pre¬ 
viously mapped MD resistance QTL found in experimental 
lines (Vallejo et al., 1998; Yonash et al., 1999). To examine if 
LY6E is a candidate gene for MD resistance in other popula¬ 
tions, an association study was conducted in a commercial 
White Leghorn resource population. This population was de¬ 
rived from matings between two pure lines that exhibit differ¬ 
ences in MD resistance (Line 1 is MD susceptible, and Line 2 is 
MD resistant) and have relatively high levels of inbreeding (60-80%).
The same two substitutions at nucleotides 811 and 814
detected in the East Lansing reference panel were found
between the two commercial lines: 5'-CTCACT(C/G)AA(C/G)TGT.
Line 1 and Line 2 have C and G, respectively, at both SNPs.
The presence of the same SNP at position
811 enabled us to use the same PCR-RFLP genotyping scheme
used to map LY6E (Liu and Cheng, 2003). As shown in
Table 1, statistical analysis using 256 selected progeny at the
phenotypic extremes revealed significant associations (P <
0.01) of the LY6E SNP with length of survival following MDV
challenge and the incidence of tumors in the proventriculus.
The amount of phenotypic variation (R²) explained by LY6E
was 3% for both traits. Figure 2 shows the distribution of
measurements for survival length in days across each genotype.
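
The two-SNP motif quoted above can be expanded into the two parental haplotypes. The following is a minimal sketch, assuming the bracketed (C/G) notation lists the Line 1 allele first and the Line 2 allele second, as the text states; the helper name `expand` is hypothetical and not part of the original study.

```python
import re

# Motif as quoted in the text (5' to 3'); each (X/Y) marks one SNP.
MOTIF = "CTCACT(C/G)AA(C/G)TGT"

def expand(motif, pick):
    """Replace every (X/Y) site with allele X (pick=0) or allele Y (pick=1)."""
    return re.sub(r"\(([ACGT])/([ACGT])\)", lambda m: m.group(pick + 1), motif)

line1 = expand(MOTIF, 0)  # Line 1 (MD susceptible): C at both SNPs
line2 = expand(MOTIF, 1)  # Line 2 (MD resistant): G at both SNPs

# Offsets within the motif at which the two haplotypes differ.
snp_offsets = [i for i, (a, b) in enumerate(zip(line1, line2)) if a != b]

print(line1, line2, snp_offsets)
```

The two differing offsets are three bases apart, which matches the reported SNP positions 811 and 814.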
Discussion

MDV is a naturally occurring oncogenic avian herpesvirus
that causes lymphoproliferation and induces neoplastic disease
in chickens. The MDV genome is a double-stranded linear
DNA molecule of approximately 180 kb with a capacity to
encode more than 90 functional proteins (Lee et al., 2000;
Tulman et al., 2000), although some of them are putative. An
important goal of research on Marek’s disease is to identify
viral and cellular determinants, and the ensuing biological
pathways, that promote MDV oncogenesis or disease resistance.
One approach to reach this goal is to identify and characterize
host proteins that interact with viral proteins. Clearly, MDV
can interact differently with various cell types, leading to
productive infection, latent infection, or cellular transformation.
At the level of molecular interactions, namely protein-protein
interactions, host proteins are likely to be involved in the
immune response, disease pathogenesis and resistance, and the
interaction with MDV proteins may alter the response or function.
We previously demonstrated that combining a protein-protein
interaction screen with genome mapping was a powerful
strategy for identifying growth hormone as an MD resistance
gene (Liu et al., 2001b). With the recent introduction of
two-hybrid systems based in E. coli, which facilitate screening
for protein-protein interactions, we have extended our efforts
to other MDV genes.

In this study, we found by two-hybrid screening that chicken LY6E
specifically interacts with MDV US10 and confirmed this
protein interaction with an in vitro protein-binding assay.
To assess LY6E as a candidate QTL for MD resistance, an
association study was carried out using 256
backcross chickens selected from a commercial layer resource
population. LY6E was found to be significantly
associated with MD resistance, specifically with the length of
survival following MDV challenge and the extent of tumor
incidence in the proventriculus (P < 0.01). These two results together
strongly suggest that LY6E is an MD resistance gene and a
deserving candidate for further investigation of its potential
role in MD pathogenesis and disease resistance.

Our conclusion that LY6E is an MD resistance gene is sup¬ 
ported by two previous DNA microarray studies that evaluated 
LY6E expression with respect to MD. In the first experiment 
using 1,200 selected genes from lymphocytes, LY6E was one of 
four genes repeatedly found differentially expressed between an 
MD resistant line (Line 6) and an MD susceptible line (Line 7) 
following MDV infection (Liu et al., 2001a). In an independent 
trial using the same set of genes, Morgan et al. (2001) reported 
that the LY6E was repeatedly induced two to three-fold follow¬ 
ing infection of chicken embryo fibroblasts with oncogenic 
MDV. These two results are consistent as Line 7, which sup¬ 
ports higher MDV titers, had significantly higher amounts of 
LY6E expression compared to Line 6, where MDV viremia lev¬ 
els are lower, indicating that levels of LY6E and MDV infec¬ 
tion are correlated. This suggests a functional relationship. 

LY6E was first cloned in mouse and called thymic shared
antigen 1 or Tsa1 (MacNeil et al., 1993; Classon and Coverdale,
1994). It is a member of the Ly-6 cell surface proteins.
The Ly-6 superfamily of molecules are differentially expressed
in several hematopoietic lineages and appear to function in
signal transduction and cell activation. The chicken LY6E
homologue was first identified as one of several genes that were
upregulated in bone marrow cells transformed with v-Rel, an
NF-κB transcription factor (Petrenko et al., 1997). Like most Ly-6
proteins, LY6E is anchored to the cell membrane by a
glycosyl-phosphatidyl-inositol (GPI) moiety.

The exact function of LY6E is not known, and it is suggested 
that LY6E may have different functional roles compared to 
other Ly-6 proteins in the immune system since the level of 
amino acid similarity of LY6E to other family members is rela¬ 
tively low (Fleming et al., 1993). Functional studies using 
monoclonal antibodies to LY6E have suggested that LY6E acts 
both as a positive and negative signal when it interacts with the 
T-cell receptor (TCR) complex, which in turn can induce inter¬ 
leukin 2 (IL-2) production and T cell proliferation (Saitoh et al., 
1995). The signaling of LY6E may be mediated by a physical
interaction with FcγRIIB, the immunoglobulin receptor that is
part of the regulatory mechanism for inhibiting the activation
of B cells, mast cells, macrophages, and neutrophils. SF20/IL-25,
a novel secreted bone marrow stroma-derived growth factor,
has been shown to bind to LY6E and is sufficient to stimulate
cells to proliferate in a dose-dependent manner (Tulin et
al., 2001).

LY6E is expressed by peripheral B cells, immature T cells,
activated T cells, thymus stromal cells, and macrophages.
Developmental regulation of LY6E expression has been reported,
which suggests that LY6E plays a regulatory role in thymocyte
maturation (Randle et al., 1993). This hypothesis is supported
by studies showing that LY6E influences anti-TCR/CD-3-induced
apoptosis in immature thymocytes (Noda et al., 1996) and that
antibody to LY6E alters thymocyte differentiation in vitro
(Waander et al., 1989). However, this hypothesis is contradicted
by a report showing that LY6E knockout mice have normal T and
B cells (Zammit et al., 2002).

The functional involvement of LY6E in T cell differentiation
and activation, the overlap between cells that express
LY6E and cells that are infected by MDV, and the ability of MDV US10
to specifically bind LY6E suggest a potential role for LY6E in
MDV pathogenesis. Activated T cells support MDV infection
(Calnek et al., 1984) and the majority of transformed T cells
identified are activated helper CD4+/CD8- T cells (Schat et al.,
1982). Thus, LY6E expression may be associated with the
determination of target cells for viral infection. The binding of
US10 to LY6E could alter apoptosis and developmental regulation
in thymocytes. If the physical binding of LY6E and MDV
US10 does occur in vivo, it is possible that this interaction
leads to protection from apoptosis in MDV-infected T cells. It is
well recognized that many viruses, including numerous herpesviruses,
produce proteins to inhibit apoptosis as a defense
mechanism (e.g., Derfuss and Meinl, 2002). US10 may influence
the ability of LY6E to modulate Fc receptor functions,
especially those involving phagocytosis and antigen presentation.
It is important to note that growth hormone and other
confirmed chicken proteins that we have identified interact
with one or more MDV proteins to influence gene regulation
and/or the immune response (unpublished). If one or more
of these possibilities are true, then this may argue that RM1
may be attenuated by the perturbed expression of both SORF2
and US10, which potentially influence growth hormone and
LY6E activities, respectively. To address these possibilities, as
well as to confirm the interaction of LY6E and US10 in vivo,
we are generating antibodies against chicken LY6E, which can
be used in future experiments to evaluate LY6E and US10
colocalization, LY6E activity, etc.

In conclusion, we have reported that the LY6E protein
interacts with MDV US10 protein, and that the LY6E gene is
associated with MD resistance. Combined with the fact that
LY6E is differentially expressed in lymphocytes between MD
resistant and susceptible chickens, and its role in T cell
development and the immune response, we suggest that LY6E is an
MD resistance gene. This information enhances our understanding
of how the chicken immune system may respond to
MDV infection. This is a second example demonstrating the
power of using proteomic functional screens to yield candidate
genes that can be tested by genomic techniques. We
argue that this combination is a powerful strategy for identifying
the molecular determinants of complex traits, especially
those involved in infectious disease resistance. While the
gene effects for both LY6E and GH are relatively small,
combinations of two or more MD resistance genes may have substantial
biological and economic effects, especially given the evidence
of epistasis (Vallejo et al., 1998). Experiments are underway
to conduct a comprehensive screen of MDV-chicken protein
interactions to reveal additional MD resistance genes and
shed more light on MDV pathogenesis pathways.
Abstract. Telomerase RNA (TR) is essential for telomerase
activity and the maintenance of telomere length in proliferating
cell populations. The objective of the present research was to
define the cytogenetic and molecular genomic organization of
chicken TR (chTR). The chTR exists as a single copy gene
(TERC, alias TR), mapping to chromosome 9 (GGA9). The
loci on the q arm of GGA9 map to three chromosomes in
human with five of the nine GGA9q loci mapping to HSA3q.
Sequencing of the chTERC locus (3,763 bp) from the UCD 001
genome (Red Jungle Fowl) included: 604 bp 5', 465 bp coding, and
2,694 bp 3' (from -604 to +3159). Sequence analysis included
homology searches conducted on several levels including
comparisons among different chicken genotypes, Marek’s disease
virus (MDV) sequences, plus human and murine sequences. We provide
evidence for distal 5' and 3' sequence homology between
chTERC and the MDV genome among other known regions of
homology (promoter and coding), elaborate on 5' transcription
factor binding motifs among the various genomes, and
show the type and number of TERT-related motifs 3' of chicken
TR (e.g., Sp1, c-Myb, c-Myc, AP2, among others). Surrounding
the gene are more than 25 Sp1 sites, over 20 oncogene
transcription factor binding motifs and numerous hormonal and
other specialized binding motifs. Knowledge of 5' and 3'
chTERC regulatory elements will be useful for investigating
normal control mechanisms during growth and development as
well as investigating the potential for dysregulation of this
important gene during oncogenesis, especially among different
genotypes.

Copyright©2003 S. Karger AG, Basel 


The telomeric DNA of eukaryote linear chromosomes is
replicated by telomerase, a dedicated ribonucleoprotein
(Greider and Blackburn, 1985, 1989; Yu et al., 1990). The
enzyme consists of two components, an RNA subunit (TR, also
known as TERC) which provides various binding and template
functions, and a protein subunit (TERT) having catalytic function.
TERT, a specialized member of the reverse transcriptase
family, copies the template region of TR, specifically extending
the length of the 3' end of the telomere. Extension of the 3' end
results in the synthesis of longer complementary daughter
strands upon replication. In the absence of telomerase activity,
each round of replication results in a gap at the telomere following
removal of the most terminal 5' RNA primer from the newly
synthesized daughter strand (known as the “end-replication”
problem, Olovnikov, 1973). Eventually, telomere shortening
reaches a threshold, triggering cytogenetic aberrations, apoptosis
and replicative senescence, and thus providing a genetic
mechanism to limit cellular lifespan (Counter et al., 1992;
Karlseder et al., 1999). A key discovery was that abrogation of the
telomere-driven mitotic clock correlates with cellular
immortalization and human cancers (Shay and Bacchetti, 1997).

Regulation of telomerase activity in somatic cells and the
extent of a telomere clock mechanism governing cell
proliferation/immortalization vary among vertebrates. Telomerase
activity is developmentally regulated in human (Counter et al.,
1992) and chicken (Taylor and Delany, 2000) along with
age-associated in vivo telomere shortening (Harley et al., 1990;
Taylor and Delany, 2000). Age-associated telomere shortening
is also indicated for sheep, pig and cow (Kozik et al., 1998;
Shiels et al., 1999; Lanza et al., 2000), whereas in the lab mouse
(Kipling and Cooke, 1990; Coviello-McLaughlin and Prowse,
1997) and Xenopus (Bassham et al., 1998) telomerase activity
is constitutive in adult somatic tissues and the telomere-driven
mitotic clock is not a primary mechanism controlling cellular
lifespan. However, telomerase knockout mice exhibit various
reproductive and immunological defects following five to six
generations of breeding (Lee et al., 1998; Hemann et al., 2001).
The chicken shares many features with human, e.g.,
age-associated telomere shortening (in vivo) (Delany et al.,
2000), telomerase downregulation in most somatic cells/tissues
during embryonic and postnatal development (Taylor and Delany,
2000) and telomerase activity in transformed but not primary
cells in vitro (Swanberg and Delany, 2003). Notably,
human and chicken cells share the feature of a general cellular
resistance to spontaneous immortalization/transformation
(Lima et al., 1972; Forsyth et al., 2002; Delany et al., 2003).

Regulation of telomerase activity through TR and TERT
expression involves a complex network of transcription factors
and cis-acting sequences and, to date, is best understood in
model organisms (e.g., single-cell ciliates, yeast) and human
(Mergny et al., 2002). TERT gene organization has been
described in several species; the gene is large (ca. 40 kb in
human), containing multiple reverse transcriptase motifs along
with a telomere-specific motif (Cong et al., 1999; Wick et al.,
1999). A comparison of vertebrate TERC gene organization
(Chen et al., 2000) indicates variation in gene length and
primary sequence (range of 382-550 bp in hamster and
dogfish shark, respectively), yet a highly conserved secondary
structure is evident. Interestingly, a series of GenBank sequence
deposits during the year 2000 (e.g., AF331499) and
research articles (Lee et al., 2000; Tulman et al., 2000) provided
sequence data indicating that Marek’s disease virus (MDV)
genomes contain TERC sequences. MDV, a DNA herpesvirus,
induces a T-cell derived malignancy in chickens and is related
to the Epstein-Barr virus involved in Burkitt’s lymphoma
malignancies in humans. Recently, Fragnet et al. (2003)
showed that the 5' promoter region, in addition to the coding
region of TERC, exists within oncogenic MDV strains and
provided evidence for viral-derived TERC expression in
peripheral blood leukocytes of infected birds.

The present research describes the cytogenetic and genomic
organization of the chicken TERC locus, specifically the map
localization of TERC including the elaboration of a conserved
human-chicken synteny group, as well as sequence features
including conservation of regulatory elements among known 5'
TERC sequences. Further, new 3' sequence information is presented
including identification of regulatory motif elements
known to be involved in the regulation of TERT and evidence
that in addition to 5' and coding region sequences, 3' regions of
chTERC exist in oncogenic MDV genomes. Upstream and
downstream sequence should prove useful for elucidation of
the functionality of regulatory elements controlling chTERC
gene expression in cells in vitro and in vivo under normal and
progressive disease conditions.


Materials and methods 

Genetic resources 

PCR and sequencing were performed using DNA from the inbred line
UCD 001 (Red Jungle Fowl, Gallus gallus gallus) (Pisenti et al., 2001). The
BamHI chicken BAC library was prepared from UCD 001 DNA (Lee et al.,
2003). Metaphase chromosomes were derived from chick embryo fibroblast
(CEF) cultures established from E11 embryos of a commercial layer
production stock (Hy-Line International).

Sequencing 

Inverse PCR (iPCR) (Sambrook and Russell, 2001a) was utilized to
create genomic DNA templates for amplification of the 5' and 3' regions of
the chTERC gene. Genomic DNA (5 µg) from UCD 001 and UCD 003 was
digested with RsaI, HinfI, and PstI at 37 °C overnight. Restriction digests
were self-ligated at 4 °C overnight with T4 ligase (Promega). Chicken TERC
sequences were selectively PCR amplified with primers TRp7F
(5'-CAAAAAAACGTCAGCGAGGGGTCCG-3') and TRp12R
(5'-GGAGCGCGGCGACAGCACCATCAA-3') designed from the chTERC coding
sequence (Chen et al., 2000) under the following reaction conditions:
1× Herculase buffer (Stratagene), 6% DMSO, 0.2 mM dNTPs, 0.4 mM of each
primer, 1.5 U Herculase Polymerase, and 10 ng digested/self-ligated DNA.
Cycling parameters were as follows: 98 °C 2 min, 35 cycles of 98 °C 40 s,
X °C 30 s, 72 °C 5 min with a final extension of 15 min. The annealing
temperatures (X °C) were 63 °C (RsaI), 56 °C (HinfI) or 64 °C (PstI). Amplicons
were resolved on a 1% 0.5× TBE gel, purified with Qiaquick Spin Gel
Extraction columns (Qiagen) then directly sequenced on an ABI 377 machine
(Davis Sequencing). Sequence analysis was performed using DNAstar software
(Lasergene). Transcription factor binding sites were assessed by means of
TESS (Schug and Overton, 1997; Wingender et al., 2000) using combined
string search default parameters; homology searches were conducted using
standard nucleotide BLAST or pairwise BLAST (http://www.ncbi.nlm.nih.gov/)
with default parameters and repetitive regions located by Repeat Masker
(http://searchlauncher.bcm.tmc.edu/).

Probes for BAC library screening 

A 1.3 kb fragment containing chTERC and the region 3' was amplified
from 10 ng UCD 001 genomic DNA in a 50 µl reaction with the following
conditions: 1× Platinum Taq buffer (Invitrogen), 4% DMSO, 1 mM MgCl2,
0.2 mM dNTPs, 20 pmol each of primers TRp11F
(5'-CCATAGCGGGGCGGCAGCG-3') and TRp26R (5'-CCACGTGTTCCGGCAATGAG-3')
and 2 U Platinum Taq (Invitrogen). Thermal cycling parameters were as
follows: 94 °C 2 min, 35 cycles of 97 °C 40 s, 58 °C 30 s and 72 °C 2.5 min,
followed by a 5 min 72 °C extension. Fragments were resolved on a 1% 0.5×
TBE agarose gel and purified with Qiaquick Spin Gel Extraction columns.
Additionally, a portion of the chTERC coding region was PCR amplified
using primers TRp5F (5'-CCCTCCGCCCGCCCGCTGTTTTAC-3') and
TRp6R (5'-GGCGCCGCTCCCGTTTGCTCTGCT-3') (330 bp amplicon)
under the same conditions as for TRp11-26, with the exception of a 64 °C
annealing temperature and a 1 min 72 °C extension.
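
The primer sequences listed in the Methods can be summarized with a few lines of code. This is a minimal sketch for illustration only: it reports length, GC content and reverse complement for each primer as printed. The function names are hypothetical, and no melting-temperature model is assumed.

```python
# Primer sequences as printed in the Methods (5' to 3').
PRIMERS = {
    "TRp7F": "CAAAAAAACGTCAGCGAGGGGTCCG",
    "TRp12R": "GGAGCGCGGCGACAGCACCATCAA",
    "TRp11F": "CCATAGCGGGGCGGCAGCG",
    "TRp26R": "CCACGTGTTCCGGCAATGAG",
    "TRp5F": "CCCTCCGCCCGCCCGCTGTTTTAC",
    "TRp6R": "GGCGCCGCTCCCGTTTGCTCTGCT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def gc_percent(seq):
    """Return the percentage of G and C bases in the sequence."""
    gc = sum(base in "GC" for base in seq)
    return 100.0 * gc / len(seq)

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, {gc_percent(seq):.1f}% GC, "
          f"reverse complement {reverse_complement(seq)}")
```

The reverse complement of a reverse primer is the sequence expected on the top strand of the template, so it can be searched for directly in the deposited chTERC sequence to locate the primer binding site.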

BAC library screening 

UCD 001 BamHI BAC library filter sets were prehybridized with 0.5 M
Na2HPO4, 7% SDS, 1% BSA, 1 mM EDTA for 3 h at 65 °C. The TRp11-26 and
TRp5-6 amplicons were pooled and labeled with [α-32P]-dCTP (ICN) by nick
translation (Promega) at 15 °C for 1 h. The reactions were purified
through NucTrap columns (Stratagene), denatured at 94 °C for 10 min,
snap cooled and used for hybridization with the filters at 65 °C overnight.
Post-hybridization washes included one 2× SSC, 0.2% SDS wash and two 0.5×
SSC, 0.2% SDS washes for 15 min each at 65 °C. Hybridized membranes were put
under Kodak MR Film for 3-5 days at -70 °C. Five positive clones were
identified, grown and isolated according to the TAMU GENEfinder protocol
(http://hbz.tamu.edu/bacindex.html) or by Qiagen Midiprep 100 columns.
Each clone was re-screened individually with each probe (separately) using a
slot-blot apparatus (20 ng BAC DNA/well) according to the same hybridization
protocol used to screen the BAC filters.

Confirmation of TR-BAC identity

Restriction enzyme digests of one BAC clone (TR-BAC29) were performed
using four restriction endonucleases that did not cut within the
chTERC gene but cut at least once upstream and downstream. Six units of
BamHI, BstYI, PstI, and RsaI were used individually to digest 200 ng of BAC
DNA overnight at 37 °C. The resulting fragments were separated on a 1%,
0.5× TBE gel and Southern blotted (Sambrook and Russell, 2001b). 400 ng of
the TRp5-6 probe (330 bp) was denatured for 10 min at 94 °C and snap
cooled. The fragment was labeled with [α-32P]-dCTP (ICN) by terminal
deoxynucleotidyl transferase (Pharmacia) and purified through NucTrap
purification columns (Stratagene). The membrane was prehybridized with 0.5 M
Na2HPO4, 5% SDS at 65 °C for 3 h then hybridized with radiolabeled probe
overnight at 65 °C. Post-hybridization washes consisted of 2× SSC, 0.2%
SDS, 0.5× SSC, 0.2% SDS and 0.1× SSC, 0.2% SDS at 65 °C for 15 min each
followed by exposure to BioMax MR Film at -70 °C. Subsequently, a 1.45-kb,
TRp5-6 positive RsaI fragment was gel purified using Qiaquick Spin Gel
Extraction columns. PCR was performed with 10 ng of the 1.45-kb RsaI fragment
and TR-BAC29 (each used separately as a template) with primers
TRp5F and TRp6R and the Advantage GC-Genomic Kit (Clontech) according
to the manufacturer’s protocol (68 °C anneal-extension). The resulting
amplicon fragment was visualized on a 2% 0.5× TBE agarose gel and purified
with Qiaquick Spin Gel Extraction columns for direct sequencing (Davis
Sequencing) on an ABI 377 machine.

Cytogenetic mapping by fluorescence in situ hybridization 

Cytogenetic localization experiments were conducted according to methods
described in Daniels and Delany (2003) with the following
changes/specifications: (1) no RNase treatment, (2) a 1.2 min chromosome
denaturation period, (3) post-hybridization washes consisted of a 2 min incubation
in 1× PBS/0.1% Tween-20 at 57 °C and a 1 min incubation in 2× SSC, 0.1%
Tween-20 at room temperature, (4) the TR-BAC29 was labeled with DIG-11
dUTP (Roche) by nick translation at 15 °C for 15 h, (5) UCD 001 5S rDNA
clones (2.2 kb insert containing coding and intergenic spacer regions; Daniels
and Delany, 2003) were labeled with Biotin-16 dUTP (Roche) by nick
translation (Promega) for 5 h at 15 °C and (6) probe detection utilized 0.7 µg
of anti-DIG FITC (Roche) and/or 0.63 µg of avidin Texas Red (Vector Labs).

Results 

TERC sequence features 

GC content. A total of 3,763 bp of genomic DNA from UCD
001 containing the chTERC gene was sequenced, which included
604 bp located 5', the 465 bp coding region, and 2,694
bp downstream (3') of the gene (GenBank Accession
AY312571). GC content was 57.6% for the entire sequence
(-604 to +3159). The region 5' of the gene (-604 to -1 bp) was
67.1% GC, the coding region was 77.0% GC (+1 to +465) and
the 3' downstream region 52.2% GC (+466 to +3159). Within
the coding region, low complexity G-rich regions were identified
by Repeat Masker at positions +130 to +200 (also
noted by Chen et al., 2000) and +338 to +378.
These regions correspond to the conserved vertebrate TR
domains between CR2 and CR3 (pseudoknot region) and
between CR5 and CR6 (hypervariable paired region), respectively.
Tests for CpG islands within the chTERC sequence
(EMBOSS, http://www.ebi.ac.uk) indicate the presence of a
major island(s) along approximately 2,000 contiguous bases, from -556 to
+1436.
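
The reported region sizes and GC percentages can be cross-checked with simple coordinate arithmetic. The sketch below assumes the gene-relative numbering used in the text, in which position -1 is followed directly by +1 (no position 0). The helper `span_length` is hypothetical.

```python
# Regions of the sequenced chTERC locus as reported in the text:
# (start, end, percent GC). Coordinates are gene-relative and skip 0.
REGIONS = {
    "5' flank": (-604, -1, 67.1),
    "coding": (1, 465, 77.0),
    "3' flank": (466, 3159, 52.2),
}

def span_length(start, end):
    """Number of bases from start to end inclusive, with no position 0."""
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the interval crosses the -1/+1 boundary
    return length

lengths = {name: span_length(s, e) for name, (s, e, _) in REGIONS.items()}
total = sum(lengths.values())

# Length-weighted average of the regional GC percentages.
weighted_gc = sum(lengths[n] * gc for n, (_, _, gc) in REGIONS.items()) / total

# Extent of the CpG island reported from -556 to +1436.
cpg_island = span_length(-556, 1436)

print(lengths, total, round(weighted_gc, 2), cpg_island)
```

The three regions sum to the stated 3,763 bp, and the length-weighted GC content comes out near 57.7%, in line with the reported 57.6% given that the regional percentages are themselves rounded. The CpG island coordinates span 1,992 bases.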

Marek’s disease virus sequence homology: 5', coding and 3'.
The top BLAST search hit for the 3,763 bp chTERC sequence
was the 569 bp chTERC sequence deposited by Chen et al.
(2000); the next three hits include Marek’s disease virus (MDV)
serotype 1 (MDV-1) sequences including the RB1B strain
(AF331499), the very virulent (vv) Md5 strain (AF243438)
(Tulman et al., 2000) and the GA strain (AF147806) (Lee et al.,
2000). Five regions of UCD 001 chTERC sequences from the
5', coding, and 3' regions aligned with the three MDV strain
accessions, see Fig. 1. The five blocks of homologies include
nucleotide positions -323 to -261 (TR1), -158 to +127 (TR2),
+228 to +498 (TR3A), +493 to +563 (TR3B) and +1048 to
+1140 (TR4). Sequence identities ranged from 82% (TR4) to
95% (TR3A). TR1 aligned with the Md5 strain (85% identity)
and with the sequence provided in Fig. 1 of Fragnet et al. (2003)
(RB1B strain) but not the GA strain. TR2, TR3A, TR3B and TR4
aligned with all three sequences available in GenBank (RB1B,
Md5, GA), see Fig. 1. TR2 and TR3A include TR coding region;
notably, missing from the MDV sequences are the chTERC coding
sequences spanning +128 to +227, corresponding to the GC-rich
sequences of the pseudoknot region (coaxial stacking region)
(Chen et al., 2000). A gap in the TR2 alignment at chTERC
positions -88 to -75 is found within the GA-MDV-1 sequence.
In addition to 5' and coding sequence homologies, downstream
3' sequence sets were found to have identity with MDV:
TR3A, TR3B and TR4. TR3A and TR3B are in fact contiguous,
but are noted separately here because sequence homologies are
not identified when the entire sequence series is queried for
homology. The 3' chTERC regions with identity in the MDV
genomes contain more than 10 regulatory element motifs as
defined by TESS (see Fig. 2). The chTERC alignments with the
Md5 (AF243438) and GA (AF147806) strains, both of which
include sequence information for unique long (UL) and inverted
repeat long (IRL) sequences of the viral genome, align identically
in both the terminal repeat long (TRL) and IRL regions (i.e., two
copies/genome), with chTERC sequences in an inverted orientation
in the IRL.
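
The homology block coordinates above can be tabulated to see how much of the 465 bp coding region is represented in the MDV sequences. This is a minimal sketch using only the coordinates given in the text. It again assumes gene-relative numbering with no position 0, and the helper names are hypothetical.

```python
# chTERC/MDV homology blocks as reported in the text (gene-relative, no 0).
BLOCKS = {
    "TR1": (-323, -261),
    "TR2": (-158, 127),
    "TR3A": (228, 498),
    "TR3B": (493, 563),
    "TR4": (1048, 1140),
}
CODING = (1, 465)

def span_length(start, end):
    """Number of bases from start to end inclusive, with no position 0."""
    length = end - start + 1
    if start < 0 < end:
        length -= 1
    return length

def overlap(a, b):
    """Number of bases shared by two inclusive intervals."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return span_length(lo, hi) if lo <= hi else 0

lengths = {name: span_length(*block) for name, block in BLOCKS.items()}
coding_covered = sum(overlap(block, CODING) for block in BLOCKS.values())
coding_missing = span_length(*CODING) - coding_covered

print(lengths, coding_covered, coding_missing)
```

Under these coordinates, 365 of the 465 coding bases fall within TR2 and TR3A, and the remaining 100 bases correspond exactly to the +128 to +227 segment described as absent from the MDV sequences. TR3A and TR3B share six positions (+493 to +498), consistent with their description as contiguous.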

Cis-motif identification (5' and 3'). A number of specialized
motifs with homology to regulatory factor binding sequences
were identified 5' of the coding sequence (see Table 1). These
include numerous Sp1 sites, cell-specific repressors and activators
(e.g., Pit-1a), as well as motifs associated with oncogenic
transcription factors (c-Myb) and steroid receptor binding sites
(GR, ER). Table 1 compares the 5' regulatory element motifs
identified in the UCD 001 chTERC sequence to the chTERC 5'
region and the MDV (strain RB1B) region recently reported by
Fragnet et al. (2003), plus human and murine TRs. The
chTERC clone sequenced by Fragnet et al. (2003) was from a
DT40 lambda library provided by the Greider laboratory at
Johns Hopkins University (C. Greider, personal communication).
For each sequence (chicken UCD 001, DT40, MDV,
human and murine) between 34 and 37 transcription factor
binding motifs were identified by TESS (these are minimum
estimates since overlapping motifs within a 1-3 bp start position
are not indicated); six of the motifs were found in all of the
genomes (e.g., Sp1, GR, CCAAT, ER, PU.1, PR) although their
number and location were usually, but not always, variable
between chicken and human/murine sequences. There were
differences between the UCD 001 and DT40 sequences,
including three additional Sp1 sites in DT40 (-423, -133, -48)
not found in UCD 001 and one site in UCD 001 not found in
DT40 (-116). Also, UCD 001 possessed additional c-Myb, ER
and BBF1 sites (one each), whereas DT40 possessed an extra
MAZ, TBP and c-Ets-1 site (one each, MAZ and TBP overlapping).
Each genome had a c-Ets-2 motif although at different
locations (see Table 1).
Fig. 1. BLAST sequence homologies between chTERC and MDV include
local and distal 5' and 3' sequences, as well as the coding region. Sequence of
UCD 001 chicken telomerase RNA (AY312571) as compared to three strains
of Marek’s disease virus sequence (GA, Md5, RB1B). Two copies of TERC
sequences are found in the fully-sequenced MDV genomes (GA, Md5),
present in a normal and an inverted orientation. The chTERC coding region
is designated in light gray on the diagram of the chicken sequence and the
MDV open reading frames (R-LORF-1, MDV001, 079, 080, telomerase) are
designated by black lines above the corresponding sequence. Sequences
denoted by boxed regions (see key) with chicken TR notations (TR1, TR2,
TR3A/3B, TR4) correspond to sequences found within various oncogenic
Marek’s disease virus sequences; percent homologies are indicated above the
coded boxes. Although not shown here, MDV-1 RB1B contains an 84% TR1
homology with chicken TERC when the promoter sequence from Fragnet et
al. (2003) is used for comparison (as yet this sequence is not available in
NCBI for BLAST searches). A gap in the TR2 alignment is found within the
MDV-GA TR sequences (between 138-174 and between 138,502-138,538).
A portion of the coding region is missing from the MDV TERC sequence(s),
positions +128 to +227 including the GC-rich coaxial stacking region within
the pseudoknot region of the conserved secondary structure (see Chen et al.,
2000). Notably, the 3' distal homology blocks (TR3B and TR4) contain
numerous transcription factor motifs including c-Myb sites (see Fig. 2). The
local 5' block (TR2) includes the proximal promoter sequence and the more
distal 5' block (TR1) includes a c-Myb motif.


A TESS screen for transcription binding factor motifs 
located 3' of the gene resulted in over 200 motifs identified 
from positions +466 to +3159. Motifs known to be involved in 
the regulation of TERT (see Mergny et al., 2002) are indicated 
in Fig. 2 (n = 60). The frequency distribution of sites decreased 
with distance from the coding region with two-thirds of sites 
located within the first 800 bp of the downstream region (see 
Fig. 2) and only three motifs identified within the last 800 bp of 
sequence. Predominant factor sites included: Sp1 (18, some
overlapping with other motifs), c-Myb (seven), ER (six), AP2
(five), PR (six), and c-Myc (two). A 71-bp mammalian interspersed
repeat (MIR) sequence was identified at positions
+2311 to +2381.

TR-positive BAC characterization 

Of the five BAC clones identified as positive during the initial
BAC library filter screening with TRp11-26 (-96 to +1162),
only one clone (TR-BAC29) was positive upon re-screening.
Restriction fragment analysis of the BAC clone and subsequent
hybridization of Southern blots to the TRp5-6 probe (+72 to +402,
coding sequence only) yielded a single intense band for each
enzyme as well as some secondary, lighter bands. The single,
intense fragments were of the size predicted from our sequencing
data of the upstream and downstream regions of chTERC,
with the exception of the BamHI fragment, whose size could not
be predicted due to the lack of a second restriction site within the
available sequence data. Further, a 330-bp fragment of the
chTERC coding region was PCR amplified from an isolated
and purified 1.45-kb TR-BAC29 RsaI restriction fragment as
well as from the undigested BAC. These amplicons were
sequence-verified and produced alignments identical to the
chTERC coding sequence; thus it was clear that the identified
BAC contained the chTERC gene.
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-Myb 


[Fig. 2 sequence panel: chTERC downstream region, positions +466 to +3159, annotated with transcription factor binding motif labels including Sp1, c-Myb, c-Myc, c-Jun, c-Ets-2, ER, PR, AP2 and Ik-2.] 


Fig. 2. Chicken telomerase RNA downstream (3') region sequence 
includes numerous transcription factor binding motifs. Over 200 motifs were 
identified by TESS, of which 60 are indicated here, corresponding to binding 
factor motifs known to be associated with TERT gene regulation (see Mergny 
et al., 2002). The motifs are shown by arrows (normal and reverse 
orientation). Each motif is labeled with its corresponding factor name, both above 
(bold) and below (italic) the associated sequence due to space constraints 
given overlapping and/or numerous motifs for certain regions. Dotted lines 
indicate a motif that extends onto the next line. The sequences in bold type 
(+466 to +563, +1048 to +1140) have high homology to MDV-encoded TR 
sequences (TR3A/B and TR4, see Fig. 1 and Results). Underlined sequence 
indicates homology with a mammalian MIR sequence. Asterisks mark 
motifs that were not found within our threshold (stricter parameters) but 
were above the standard annotated TESS threshold. 


Chromosomal location of TR 

The TR-BAC FISH signal from interphase cells consisted of 
two distinct spots, indicating a single locus. Similarly, two 
fluorescent signals on metaphase chromosomes were observed; 
these were found on a smaller acrocentric chromosome pair 
(one of the larger microchromosomes) positioned telomerically 
on the q-arm (Fig. 3A). The chromosomes exhibited a size, 
morphology, and DAPI-staining pattern consistent with GGA9 
(Daniels and Delany, 2003). Two-color dual FISH experiments 
using TR-BAC29 and 5S rDNA probes produced co-localized 
signals on GGA9 (Fig. 3B and C). 

Discussion 

Our study of the chicken telomerase RNA gene emphasized 
various aspects of the comparative genomics spectrum, from 
molecular (sequences and motifs) to cytogenetic (mapping and 
synteny groups). The UCD 001 Red Jungle Fowl (G. g. gallus) 
sequence reported here provides the longest sequence available 
to date but is in fact the third chicken TERC gene region to be 
sequenced. The diversity of chTERC sequences from different 
genotypes allows for regulatory element comparisons among 
promoter sequences of not only chicken, but also viral and 
mammalian genomes (see Table 1). The initial sequence published 
included 105 bp of promoter plus the coding region (465 bp) as 
derived from PCR-amplified New Hampshire (NH) chicken 
DNA (C. Greider, personal communication; DNA purchased 
from Clontech; Chen et al., 2000). The NH is an American 
breed of chicken developed as a dual-purpose type of bird 
(selected for meat and egg traits). No sequence differences were 
found between the UCD 001 and NH genotypes. More recently, 
Fragnet et al. (2003) sequenced 770 bp 5' of chTERC from a 
DT40 library clone (provided by C. Greider, personal 
communication; the DT40 clone was originally identified as containing 
chTERC by Chen et al., 2000). DT40 is an avian leucosis virus 
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Table 1 . Transcription factor binding motif locations and numbers for TERC promoter regions among genomes: chicken (UCD 001, DT40), viral (MDV), 
human and mouse. 


Factor^a | UCD 001^b: No. (start positions) | DT40^b: No. (start positions) | MDV^b: No. (start positions) | Human^b: No. (start positions) | Mouse^b: No. (start positions) 
Sp1 | 9 (-575, -527, -232^c, -203^c, -167, -116, -85, -52^c, -42) | 11 (-575, -527, -423, -231^c, -202^c, -166, -133, -85, -51, -48, -42) | 10 (-454^c, -259^c, -190^c, -172^c, -165, -125, -101, -51, -48, -42) | 3 (-463, -105^c, -38^d) | 3 (-590, -383, -323) 
GR | 3 (-598^c, -454, -336) | 3 (-598^c, -453, -334) | 4 (-544, -531, -382, -293) | 4 (-377, -453, -340^c, -333) | 4 (-582, -505, -153, -63) 
ZF5 | 3 (-352, -226, -214) | 3 (-350, -224, -213) | 1 (-58) | 0 | 1 (-7) 
c-Myb | 2 (-396, -316) | 1 (-314) | 2 (-386, -282) | 4 (-590, -586, -378, -134) | 0 
CCAAT | 2 (-95, -65) | 2 (-95, -65) | 2 (-95, -65) | 1 (-58) | 2 (-244, -17^e) 
ER | 2 (-167, -145) | 1 (-166^f) | 1 (-257^f) | 1 (-104^f) | 1 (-511) 
NIP | 1 (-540) | 1 (-540) | 0 | 0 | 0 
CACCC | 1 (-527) | 1 (-527) | 1 (-453) | 0 | 1 (-323) 
Pit-1a | 1 (-489) | 1 (-489) | 2 (-508, -157) | 1 (-303) | 0 
BBF1 | 1 (-478) | 0 | 0 | 1 (-132) | 0 
PU.1 | 1 (-455) | 1 (-454) | 1 (-37) | 3 (-495, -486, -154) | 1 (-583) 
PEA3 | 1 (-372) | 1 (-370) | 0 | 2 (-575, -472) | 0 
AML1 | 1 (-242) | 1 (-240) | 1 (-346) | 0 | 1 (-312) 
GCF | 1 (-231) | 1 (-230) | 0 | 0 | 0 
ETF | 1 (-209) | 1 (-208) | 0 | 0 | 3 (-124, -113, -109) 
c-Ets-2 | 1 (-568) | 1 (-236) | 2 (-362, -196) | 1 (-313) | 0 
PR | 1 (-597) | 1 (-597^f) | 2 (-382^f, -293^f) | 1 (-338) | 1 (-63) 
MAZ | 1 (-202^f) | 2 (-231^f, -202^f) | 1 (-172^f) | 0 | 0 
WT1-KTS | 1 (-46^f) | 1 (-46^f) | 1 (-46^f) | 0 | 0 
TBP | 1 (-202^f) | 2 (-231^f, -202^f) | 0 | 1 (-29) | 7 (-562, -265, -122, -118, -115, -111, -107) 
SRY beta | 1 (-351^f) | 1 (-349^f) | 0 | 0 | 1 (-193^f) 
MCBF | 0 | 0 | 0 | 1 (-347) | 0 
c-Ets-1 | 0 | 1 (-464) | 0 | 1 (-473) | 2 (-73, -53) 
NF1 | 0 | 0 | 1 (-253) | 2 (-576, -183) | 1 (-89) 
SRY | 0 | 0 | 0 | 2 (-580, -382) | 1 (-521) 
c-Jun | 0 | 0 | 1 (-524^f) | 3 (-520^f, -89^f, -70^f) | 0 
AP1 | 0 | 0 | 2 (-524, -497) | 3 (-224, -90, -70) | 2 (-454, -17) 
AP2 | 0 | 0 | 0 | 0 | 1 (-29) 
delta | 0 | 0 | 0 | 2 (-300, -139) | 1 (-58) 
c-Myc | 0 | 0 | 2 (-386^f, -181^f) | 0 | 0 


a Bold text represents those motifs shown to have function in human TERT (Mergny et al., 2002). Boxes shaded in gray represent those motifs which have been identified previously as functional for human TERC (Zhao et al., 2003). 

b Chicken UCD 001: AY312571 coding as reported here; chicken DT40 and MDV RB1B (Fragnet et al., 2003); human AF047386; mouse AF047387. All comparisons were made using 604 bp of sequence immediately 5' to the coding region for TERC. Bold start site values are those for which the motif is found in both the normal and the reverse directions. 

c Multiple motif hits within 1-3 bp of the start site (n = 2-4). Only a single site is reported and therefore, the number of sites is a minimum estimate. 

d Site not identified by TESS but has been found experimentally to be a functional binding site (Zhao et al., 2003). 

e Motif called CP1 by TESS but has the same binding motif sequence as the CCAAT-binding factor in the other species. 

f Motif did not meet thresholds set here but is above the threshold standard for TESS annotation analysis. Our threshold parameters were as follows: La > 12, La/ = 2.0, Lq = 1.0, Ld = 0. 


(ALV)-transformed B cell line derived from an SC (Hy-line 
production strain, white egg type) female chicken (Baba and 
Humphries, 1984; Baba et al., 1985). Binding factor motif 
analysis indicated that, compared to UCD 001, the DT40 5' 
chTERC region possesses additional c-Ets-1 (transcriptional 
activator protein identified in HTLV-1), TBP (TATA-box 
binding protein), and MAZ (zinc finger protein) motifs, plus 
two additional Sp1 motifs, and lacks a BBF1 (cardiac lineage 
transcription factor), an ER (estrogen receptor), and a c-Myb 
motif (motif position changes were also identified). 
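
The genotype comparison above can be checked directly against the site counts in Table 1. The following is a minimal sketch, not part of the original analysis: the dictionaries transcribe the "No." columns for UCD 001 and DT40, and the function name `motif_differences` is a hypothetical helper introduced here for illustration only.

```python
# Number of sites per factor in the 604 bp 5' region, transcribed from Table 1.
# Factors with zero sites in both genotypes are omitted.
UCD001_COUNTS = {
    "Sp1": 9, "GR": 3, "ZF5": 3, "c-Myb": 2, "CCAAT": 2, "ER": 2,
    "NIP": 1, "CACCC": 1, "Pit-1a": 1, "BBF1": 1, "PU.1": 1, "PEA3": 1,
    "AML1": 1, "GCF": 1, "ETF": 1, "c-Ets-2": 1, "PR": 1, "MAZ": 1,
    "WT1-KTS": 1, "TBP": 1, "SRY beta": 1,
}
DT40_COUNTS = {
    "Sp1": 11, "GR": 3, "ZF5": 3, "c-Myb": 1, "CCAAT": 2, "ER": 1,
    "NIP": 1, "CACCC": 1, "Pit-1a": 1, "PU.1": 1, "PEA3": 1,
    "AML1": 1, "GCF": 1, "ETF": 1, "c-Ets-2": 1, "PR": 1, "MAZ": 2,
    "WT1-KTS": 1, "TBP": 2, "SRY beta": 1, "c-Ets-1": 1,
}


def motif_differences(reference, other):
    """Return {factor: other_count - reference_count} for factors that differ."""
    factors = set(reference) | set(other)
    return {
        f: other.get(f, 0) - reference.get(f, 0)
        for f in sorted(factors)
        if other.get(f, 0) != reference.get(f, 0)
    }


if __name__ == "__main__":
    # Positive values are sites gained in DT40, negative values are sites lost.
    for factor, delta in motif_differences(UCD001_COUNTS, DT40_COUNTS).items():
        print(f"{factor}: {delta:+d}")
```

Under these counts the sketch reports gains for Sp1 (+2), c-Ets-1, TBP and MAZ (+1 each) and losses for BBF1, ER and c-Myb (-1 each), which matches the differences described in the text. It compares site numbers only and does not capture the changes in motif position.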

As telomerase activity and age-related/division-dependent 
telomere shortening profiles in vivo and in vitro are similar in 
chicken and human (Delany et al., 2000; Taylor and Delany, 
2000; Swanberg and Delany, 2003), homologies might be 
expected for the regulatory mechanisms for genes associated with 
the telomere clock. In Fig. 5 we show a comparison between 
human and chicken TERC 5' regions to assess the similarities 
and differences in type, number and location of motifs found to 
be present in two or more copies in chicken. Interestingly, 
excepting ZF5 (present in chicken only), the same motifs are 
found in both TERC promoter regions (Sp1, GR, c-Myb, ER, 
CCAAT), although chicken has a greater number of each motif 
(excepting GR, c-Myb) than human. Figure 5 also draws 
attention to the relative positions of the motifs: for several 
motifs, the positions along the upstream region of the gene 
were the same or in close proximity (within 50 bp) in both 
genomes. 

A particularly striking feature of the chTERC sequence is 
the number of Sp1 sites surrounding the gene, both 5' and 3'. 
These GC-rich boxes are known to be transcription initiator 
elements (Suske, 1999). As for TERC regulation, different Sp1 
sites local to human TERC have been shown to have both 
repressor and activator function (Zhao et al., 2003). The 
chicken 5' TERC regions (UCD 001 and DT40) possess more Sp1 
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Fig. 3. The telomerase RNA gene maps to chicken chromosome 9 
(GGA9). The TERC signal is green and the 5S rDNA is red in all images. (A) 
TR-BAC29 hybridization to chicken metaphase chromosomes showing one 
locus on a larger microchromosome pair. (B) A partial metaphase spread with 
the dual FISH results using TR-BAC29 and 5S rDNA (arrows indicate 
GGA9); both the DAPI (left panel) and corresponding dual red/green image 
(right panel) are shown. (C) A tetraploid metaphase cell showing 
co-localization of signals on four chromosomes. The DAPI-stained GGA9s are 
indicated by arrows (left panel); the corresponding FISH image (right panel) 
shows co-localized red/green signals on all four chromosomes. Insets show 
enlarged chromosomes to illustrate various hybridization patterns. The 
TERC signal appeared at a very telomeric region of the q arm of GGA9. 


sites than human or mouse (nine sites in UCD 001, 11 in DT40 
and only three each in human and mouse; notably, MDV has 
ten sites local to the telomerase gene). Since the chicken 
genome is particularly GC-rich relative to mammalian 
genomes, it may be coincidentally high in Sp1 sites. Alternatively, 
the chicken genome may preferentially employ Sp1 motifs for 
gene regulation. Here we report on the high GC content of the 
TERC region sequence (5', coding and 3' regions are 67%, 77% 
and 52% GC, respectively). A large portion of the chTERC 
sequence, extending both upstream and downstream of the 
gene (-556 to +1436), fits the criteria for a CpG island. The 
function proposed for CpG islands is to protect highly 
expressed genes from epigenetic modification (e.g., methylation). 
The methylation status for chTERC is unknown (CpG islands 
by definition are non-methylated). 
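
The GC content and CpG island statements above can be illustrated with a short calculation. This is a hedged sketch only: the text does not state which CpG island criteria were applied, so the thresholds below (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG ratio of at least 0.6) are an assumption based on the commonly used Gardiner-Garden and Frommer definition, and all function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed length, GC-content and obs/exp thresholds."""
    return (
        len(seq) >= min_len
        and gc_fraction(seq) >= min_gc
        and cpg_obs_exp(seq) >= min_ratio
    )


if __name__ == "__main__":
    toy = "CG" * 100  # toy 200-bp sequence, not chTERC data
    print(gc_fraction(toy), cpg_obs_exp(toy), looks_like_cpg_island(toy))
```

In practice such a test is usually applied in sliding windows along the sequence; the whole-sequence version here is kept deliberately simple.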

In addition to the 5' region, we report extensive 3' end 
sequence data with numerous transcription binding motifs 
identified (over 200) including a large number of binding sites 
for oncogene proteins (e.g., c-Myb, c-Myc, c-Jun, see Fig. 2). 
Altered cellular and viral oncogene expression via mutation 
and/or deregulated expression (linked to a wide variety of 
human and also chicken cancers) may have the potential to 
affect TERC expression through these local and distal upstream 
and downstream binding motifs. Numerous regulatory element 
motifs (n = 60) are found at the 3' end of chTERC that are also 
involved in TERT regulation; specifically, both Spl and c-Myc 
sites 3' of human TERT are involved in regulation (Mergny et 
al., 2002). Sharing of regulatory elements responsive to the 
same transcription factors (e.g., Myc protein) could be a 
mechanism for controlling both TERC and TERT expression. In this 
regard, Falchetti et al. (1999) demonstrated that retroviral 
infection and subsequent expression of v-Myc protein resulted 
in the emergence of telomerase activity in quail myoblasts and 
chicken neuroretina cells, previously telomerase negative. 

The existence of TERC sequence in an MDV genome was 
first noted in a 2000 GenBank sequence (AF331499) of the 
oncogenic RB1B strain. Recently, Fragnet et al. (2003) 
provided evidence for expression of the viral version of TR in 
peripheral blood of infected birds and further described local 
promoter and coding region homology between chicken and 
oncogenic MDV genomes. In this study we report additional 
and slightly different sequence homology. The 3' homologous 
regions, TR3B and TR4, are local (i.e., adjacent to the gene 
including another 100 bp) and distal (almost 500 bp away), 
respectively. Our BLAST results split the 5' promoter region 
homology (reported as 382 bp with 73% homology, see Fig. 1 of 
Fragnet et al., 2003) into two blocks of 85% (TR1) and 91% 
(TR2) homology with no sequence homology in between (see 
Fig. 1, this study). Both the 5' and 3' regions of homology 
contain c-Myb motifs, numerous Sp1 sites as well as other motifs 
(see Table 1 and Fig. 2). 

Evidence exists for TERC upregulation in human cancers 
(Cong et al., 2002; Mergny et al., 2002). The enhanced 
expression of TERC occurs via chromosomally-mediated dosage 
mechanisms including gene amplification, isochromosome 
formation and polysomy (Avilion et al., 1996; Soder et al., 1997, 
1998). Such dysregulation of TERC (a constitutively expressed 
gene) has been suggested to be an early event during 
oncogenesis contributing to cellular immortalization (Naito et al., 2001). 
One hypothesis regarding possession of the TERC gene 
sequence(s) by MDV is that it promotes upregulation/dysregulation, 
perhaps during the infectious stage. The MDV genomes 
that have been sequenced in their entirety (GA, Md5) contain 
two copies of chicken TERC (see Fig. 1); the presence of 
numerous copies of the viral genome (either episomal or 
integrated) in an infected cell could provide an increased dosage 
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Fig. 4. Conservation of synteny between chicken chromosome 9 
(GGA9q) loci, human (HSA) and mouse (MMU) chromosomes. The 
GGA9q loci map to three human chromosomes (HSA1,2, and 3) with five of 
the nine loci mapping to the q-arm of HSA3; GGA9q loci map to four mouse 
chromosomes (MMU1, 3, 8, and 16). Chicken map adapted from Groenen et 
al. (2000), Schmid et al. (2000), Daniels and Delany (2003), and 
http://www.arkdb.org; the mouse and human map information is from 
http://www.informatics.jax.org and http://ncbi.nlm.nih.gov/LocusLink. For each 
locus, information in parentheses indicates cM position (#) or cytogenetic 
location (i.e., proximal (prox), terminal (ter), centromere (cen), and G-band 
locations for human and mouse). The cM positions for chicken vary 
according to the linkage map examined (e.g., East Lansing, Compton); values here 
are from the consensus map midpoint (Groenen et al., 2000; linkage data for 
RN5S from Daniels and Delany, 2003). = locus not mapped, RN5S = 5S 
rDNA, TFRC = transferrin receptor, EIF4A2 = eukaryotic translation 
initiation factor 4A2, NCL = nucleolin, PAX3 = paired box 3, SKIL = ski-novel 
protein overexpressed (also known as SNON), MCW0134 = microsatellite, 
TR = telomerase RNA (also known as TERC). 
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Fig. 5. Comparison between chicken (UCD 001) and human transcription factor binding motifs located 5' of the telomerase 
RNA gene. Approximate locations of motifs (those present in two or more copies within the 604 bp 5' sequence) are indicated for 
the chicken sequence (AY312571). Comparative human sequence (AF047386) is arranged below the chicken sequence. 
Overlapping motifs are stacked. Dashed lines connect the motifs that are located within a 50 bp region for both species. Parenthetical values 
within the boxed key indicate the number of sites in chicken and human, respectively. 


(amplification) opportunity similar to that described for the 
human cancers (Soder et al., 1997). 

Mapping of chTERC to GGA9 has led to further 
elaboration of conserved synteny among higher vertebrate genomes 
(see Fig. 4). As has been noted by others (Suchyta et al., 2001; 
Burt, 2002), the chicken genome holds many blocks of genes in 
common with human, although gene order is often disrupted. 
Here we report that of nine GGA9q loci, five map to HSA3q 
(including the telomerase RNA gene), two map to 
HSA2q and one (5S rDNA) to HSA1q. In regard to the mouse 
karyotype, the loci encoded by GGA9q are distributed among 
four chromosomes. Conservation of synteny between the 
human and chicken genomes promotes the identification of 
candidate genes in chicken based on their human chromosome 
locations; such information will become even more valuable as 
the chicken genome (UCD 001) draft and annotated sequence 
become available (http://www.nhgri.nih.gov/DER/Sequencing/ 
proposals.html). 

Equally important to establishing coding sequences is the 
elaboration of upstream and downstream sequences and their 
functionality in controlling gene expression in a temporal and 
tissue-specific manner. Differences in regulatory element 
motifs of different chicken genotypes as shown here may influence 
aspects of TR regulation during growth and development as 
well as provide the opportunity for dysregulation during 
transformation, especially during viral diseases involving oncogenes, 
given the number of c-Myb, c-Myc and c-Jun binding motifs 
surrounding chicken TR. Such genotype-based differences in 5' 
and 3' control regions may also contribute to the genetic bases 
for resistance and susceptibility to diseases in addition to the 
traditionally considered role of coding region allelic variation. 
The extent of the sequence now available for chicken TR 
should lead to new experimental analysis of the functional role 
for TR during transformation. 
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Abstract. Although vertebrate telomeres are highly con¬ 
served, telomere dynamics and telomerase profiles vary among 
species. The objective of the present study was to examine 
telomerase activity and telomere length profiles of transformed 
and non-transformed avian cells in vitro. Non-transformed 
chicken embryo fibroblasts (CEFs) showed little or no 
telomerase activity from the earliest passages through senescence. 
Unexpectedly, a single culture of particularly long-lived 
senescent CEFs showed telomerase activity after over 250 days in 
culture. Transformed avian lines (six chicken, two quail and 
one turkey) and tumor samples (two chicken) exhibited 
telomerase activity. Telomere length profiles of non-transformed 
CEF cultures derived from individual embryos of an inbred 
line (UCD 003) exhibited cycles of shortening and lengthening 
with a substantial net loss of telomeric DNA by senescence. 
The telomere length profiles of several transformed cell lines 
resembled telomere length profiles of senescent CEFs in that 
they exhibited little of the typical smear of terminal restriction 
fragments (TRFs), suggesting that these transformed cells may 
possess a reduced amount of telomeric DNA. These results 
show that avian telomerase activity profiles are consistent with 
the telomerase activity profiles of human primary and 
transformed cells. Further, monitoring of telomere lengths of 
primary cells provides evidence for a dynamic series of changes over 
the lifespan of any specific cell culture, ultimately resulting in 
net telomeric DNA loss by senescence. 

Copyright © 2003 S. Karger AG, Basel 


Telomere shortening is believed to occur due to regulation 
of telomerase, one of its components or accessory proteins 
(Forsyth et al., 2002; Karlseder et al., 2002; Chan et al., 2003). 
Shortening of telomeres can result in a loss of growth potential 
characteristic of replicative senescence, an increase in genome 
instability and cell death by induction of the DNA-damage 
response and apoptosis pathways. The length of the shortest 
telomere may trigger telomere dysfunction and loss of growth 
potential (Hemann et al., 2001a, 2001b). In this model, short 
telomeres are recognized as damaged, signaling a G2/M cell 
cycle arrest affording the cell time to repair the damage. If the 
damage is not repaired, a checkpoint response results in further 
cell cycle arrest or apoptosis (Fee et al., 1998; Hemann et al., 
2001b). Induction of the DNA-damage response by telomere 
shortening may be a protective genetic mechanism that 
prevents the proliferation of abnormal, aging cell lineages. 
Alternatively, senescence may be induced not by shortening of 
telomeres per se but by loss of the protective effect of accessory 
proteins or telomerase on capped telomeres (Karlseder et al., 2002; 
Chan et al., 2003). Regardless of the mechanism involved, a 
hallmark of tumorigenesis is the re-emergence of telomerase 
activity, which enables tumor cells to evade DNA-damage 
pathways. Persistence of telomerase in immortalized cells may 
prevent apoptosis by stabilizing shortened telomeres. Reactivation 
of telomerase appears to induce resistance to apoptosis (Hahn 
et al., 1999; Herbert et al., 1999; Holt et al., 1999). 

Although the vertebrate telomere repeat sequence is highly 
conserved, telomere organization, telomere dynamics and 
telomerase profiles vary among species (see Delany et al., 2000; 
Forsyth et al., 2002; and Delany et al., 2003 for review). Gallus 
gallus domesticus, the domestic chicken, has a long history as a 
model organism in developmental biology, is a significant 
resource for human vaccine production and research, and is 
also a globally important food-animal species. 
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Telomere array abundance, size, and location have been 
examined in the chicken genome (2n = 78, diploid size of 
2.5 pg). Chicken telomere arrays range from 0.5 kb to 2 Mb and 
have been classified based upon size, chromosome location, 
Southern blot banding pattern, and age-related shortening. 
Class I arrays range from 0.5 to 8-10 kb, exhibit a Southern blot 
pattern of discrete bands and do not shorten in a 
division-dependent manner. These arrays resist digestion by Bal 31 
exonuclease, suggesting an interstitial location. Class II arrays range 
from 8-10 kb to 35-40 kb, exhibit a smeared Southern blot 
pattern of overlapping telomere fragments and demonstrate 
division-dependent shortening. Class III arrays range in size 
from 40 kb to ~2 Mb, are rapidly digested by Bal 31, indicating 
a terminal location, exhibit hypervariable patterns of discrete 
bands when Southern blotted (even between individuals within 
a highly inbred line) and are the longest telomere arrays 
reported to date for any vertebrate (Delany et al., 2000, 2003; 
Taylor and Delany, 2000). 

The objectives of the present study were to examine 
telomerase activity profiles and telomere length dynamics in 
transformed and non-transformed avian cells in vitro within the 
context of the unusual features of the avian genome. The results 
indicate that the telomeres of chicken embryo fibroblasts in 
vitro undergo a dynamic series of events, as evidenced by 
measurement of shorter and longer mean telomere restriction 
fragments over the life span of the cultures, followed by a 
precipitous erosion of telomeric DNA at senescence. This dramatic 
erosion of telomeric DNA may be attributable to some as yet 
unknown active mechanism rather than to passive attrition of 
telomeric sequence through incomplete end replication in the 
absence of telomerase. The telomere profiles of telomerase-positive 
transformed avian cell lines examined here may provide 
evidence of catastrophic pre-transformation erosion of telomeres. 

Our findings, which are consistent with studies of human 
cells in vitro wherein telomerase activity is absent from normal 
fibroblasts (which show a net loss of telomeric DNA over the 
lifetime of the culture) and present in transformed cell 
populations, provide evidence that chickens and other avian species 
possess telomere clock mechanisms and, in spite of the unique 
features of its genome, establish that the chicken is an 
appropriate and biologically relevant system for studies of human 
replicative senescence. 

Materials and methods 

Cell culture 

Chicken embryo fibroblasts (CEFs) were purchased from the American 
Type Culture Collection (ATCC) or isolated from E11 embryos to create 
pooled-embryo or individual-embryo cell cultures (Lima and 
Macieira-Coehlo, 1972). For the pooled cell culture (CEF2), E11 commercial 
layer-type embryos (n = 12) were utilized. For individual-embryo cell cultures 
(CEF3 1-6), six E11 embryos (UCD 003 line; Pisenti et al., 2001) were 
utilized. Cell cultures maintained in DMEM with L-glutamine, 10% FBS, and 
5% penicillin-streptomycin were split 1:3 or 1:4 until senescence. Senescence 
was determined by growth dynamics, cellular morphology and, in the case of 
CEF3 cultures, by a β-galactosidase assay (Dimri et al., 1995). Population 
doubling was determined for each passage of CEF3 cultures using the 
following equation: 

$\text{Population doubling} = (\log N_t - \log N) / \log 2$ 


with $N$ the number of cells seeded and $N_t$ the number of viable cells at the end 
of the passage (Patterson, 1979; Venkatesan and Price, 1998). Senescence 
staining was performed using the Senescence β-galactosidase Staining Kit 
(Cell Signaling Technology). Cultures were deemed senescent when >90% of 
the cells were positive for β-galactosidase. Cultures were maintained until no 
further samples could be extracted. 
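
As an illustration of the population doubling equation above, a minimal sketch follows. The function name and the example cell counts are hypothetical and are not data from this study; base-10 logarithms are used, although any base gives the same ratio.

```python
import math


def population_doublings(n_seeded, n_harvested):
    """Population doublings for one passage: (log N_t - log N) / log 2."""
    if n_seeded <= 0 or n_harvested <= 0:
        raise ValueError("cell counts must be positive")
    return (math.log10(n_harvested) - math.log10(n_seeded)) / math.log10(2)


if __name__ == "__main__":
    # Hypothetical passages as (cells seeded, viable cells at end of passage).
    passages = [(1.0e5, 4.0e5), (1.0e5, 3.0e5), (1.0e5, 2.0e5)]
    per_passage = [population_doublings(n, n_t) for n, n_t in passages]
    print(per_passage, sum(per_passage))
```

Summing the per-passage values gives the cumulative population doublings of a culture.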

Transformed cells 

Transformed cells were acquired from a variety of sources. DT-40, LMH, 
and QT6 cells were obtained from the ATCC. DT-40 cells were also acquired 
from Dr. Jean-Marie Buerstedde at the Department of Cellular Immunology, 
Heinrich-Pette-Institute, Germany. LMH/2A, RP19, MSB1 and 
RB1B-infected spleen and thymus tumor cells were donated by Dr. Carol Cardona 
of the University of California School of Veterinary Medicine, Davis CA. 
MQ-NCSU cells were acquired from Dr. Muquarrab Qureshi of the 
Department of Poultry Science, North Carolina State University, Raleigh NC. RP-9 
cells were provided by Dr. Henry Hunt, USDA-ARS-ADOL, Michigan via 
Dr. Marcia Miller, Beckman Research Institute, Duarte CA. See Table 1 for a 
description of these cells and cell lines. 

DNA Isolation and analysis of Terminal Restriction Fragments (TRF) 

Genomic DNA was extracted from CEFs spanning the earliest passages 
through senescence. DNA samples were isolated and purified using the 
AquaPure Genomic DNA Isolation Kit (BIORAD). Purified DNA samples 
were digested overnight with Haelll and quantified by fluorometry (Molecu¬ 
lar Dynamics Fluorimager 595). 50 ng of DNA per lane were separated by 
electrophoresis in 0.6% agarose gels for 4 h at 55 volts. Using this protocol, 
high molecular weight Class III telomeric DNA is retained (i.e., does not 
migrate) and only Class II fragments are analyzed. Mean telomere length and 
percent telomeric DNA were determined for all lanes of each gel. To examine 
telomere shortening in a typical telomere restriction fragment smear, molecular 
weight markers were run on each gel. Prior to hybridization, each gel was 
stained with ethidium bromide and photographed. The gels were destained, 
Southern-blotted and hybridized with a 32P-labeled (TTAGGG)7 probe as 
previously described (Taylor and Delany, 2000). Blots were exposed to 
Kodak BioMax MR film and the resulting autoradiographs were compared 
to the gel photographs. Molecular weight markers, determined with reference 
to the gel photographs, were noted on the autoradiographs. Autoradiographs 
were scanned and analyzed with Kodak 1D image analysis software version 
3.6. Mean telomere length was defined as $\sum(\mathrm{OD}_i \times L_i)/\sum \mathrm{OD}_i$ (Taylor and 
Delany, 2000; Ramirez et al., 2003), with $\mathrm{OD}_i$ the net intensity (intensity minus 
background) of the DNA at a given position on the gel and $L_i$ the DNA length 
at that same position as measured by the image analysis software. $\mathrm{OD}_i$ and $L_i$ 
measurements were made at 12 points along each lane of a typical blot. To 
supplement mean TRF analysis, total telomeric DNA, consisting of the total 
integrated signal ($\sum \mathrm{OD}_i$) over the same range of fragment sizes used for mean 
TRF analysis, was determined for each lane by densitometry. Integrated signals 
from each lane were expressed as a percentage of the signal from early 
passage DNA as previously described (Harley et al., 1990). 
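
The two lane measurements defined above can be sketched in a few lines of Python. This is only an illustration of the stated formulas, not the image analysis software's implementation: the function names, the clamping of negative net intensities to zero, and all numbers in the example are assumptions.

```python
def net_intensities(od, background=0.0):
    """Subtract background from each reading.

    Clamping negative values to zero is an assumption of this sketch,
    not something stated in the methods.
    """
    return [max(value - background, 0.0) for value in od]


def mean_trf_length(od, lengths, background=0.0):
    """Mean TRF length = sum(OD_i * L_i) / sum(OD_i), using net intensities."""
    if len(od) != len(lengths):
        raise ValueError("od and lengths must have the same number of points")
    net = net_intensities(od, background)
    total = sum(net)
    if total == 0:
        raise ValueError("no signal above background")
    return sum(n * l for n, l in zip(net, lengths)) / total


def percent_telomeric_dna(od, od_early, background=0.0):
    """Integrated lane signal as a percentage of the early-passage lane."""
    early_total = sum(net_intensities(od_early, background))
    if early_total == 0:
        raise ValueError("early-passage lane has no signal above background")
    return 100.0 * sum(net_intensities(od, background)) / early_total


if __name__ == "__main__":
    # Hypothetical readings at 12 positions along one lane (lengths in bp).
    lengths_bp = [23000, 21500, 20000, 18500, 17000, 15500,
                  14000, 12500, 11000, 9500, 8750, 8000]
    early_lane = [12, 20, 31, 40, 46, 44, 38, 30, 22, 15, 10, 7]
    late_lane = [5, 8, 12, 15, 17, 18, 16, 13, 10, 7, 5, 4]
    print(round(mean_trf_length(late_lane, lengths_bp, background=2)))
    print(round(percent_telomeric_dna(late_lane, early_lane, background=2), 1))
```

Keeping the two functions separate mirrors the text: mean TRF is a signal-weighted average length, whereas percent telomeric DNA depends only on the integrated signal.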

Preparation of cell extracts and telomerase assays 

Telomerase activity was assayed using the Telomeric Repeat Amplification 
Protocol (TRAP) (Kim et al., 1994) in which telomerase adds telomeric 
repeats to a synthetic oligonucleotide primer followed by PCR amplification. 
Cell extracts were prepared and analyzed using the TRAPeze Telomerase 
Detection Kit (Intergen). Two micrograms of protein were used in each 
TRAP assay, with protein concentration determined by Bradford assay. 


Results 

Non-transformed cells 

Telomerase profiles of chicken cells in vitro. Typically, the 
CEF cultures derived from pooled or individual E11 embryos 
showed no telomerase activity from the earliest passages 
through senescence (see Table 1 and Fig. 1A). Exceptions included 
low activity detected in CEF3 cultures 3 and 4 (the first 
50 bp band of the telomerase oligonucleotide ladder was faintly 

Fig. 1. Telomerase activity is lacking in primary CEFs and present in 
transformed avian cell lines and tumors. (A) Representative TRAP assay 
results showing lack of telomerase activity in the CEF2 culture in passages 
0-15, senescent CEF2 cells (S) and in ATCC CEFs (SL-29). 
(B) Positive TRAP assay results of CEF2 cells >250 days in culture. 
(C) TRAP assays of transformed avian cell lines and tumors. Lane 12: transformed 
human embryonic kidney cells (positive control). Lane 13: chicken 
neurula (positive control). Negative controls not shown. See Table 1 for 
details on cell cultures and tumors. 


Table 1. Telomerase profiles of avian cells and cell lines 

| Designation (a) | Telomerase | Description (b) | Agent of transformation (c) |
|---|---|---|---|
| Non-transformed | | | |
| SL-29 | - | ATCC, CEF, PD18 | |
| CEF2 | - | P0-P15 | |
| CEF2 (d) | + | > 250 days in culture | |
| CEF3-1 | PD23 only | PD1 - PD23 | |
| CEF3-2 | PD4 only | PD3 - PD30 | |
| CEF3-3 | PD8 only | PD4 - PD27 | |
| CEF3-4 | - | PD8 - PD26 | |
| CEF3-5 | - | PD0.4 - PD24 | |
| CEF3-6 | - | PD1.5 - PD26 | |
| kidney | + | fibroblast | |
| pre-blastula (Stage X) | + | embryo | |
| gastrula or neurula | + | embryo | |
| Transformed in vitro | | | |
| RP-19 (e) | + | turkey B cell | MDV |
| DT40 (f) | + | B cell (bursal lymphoma) | ALV |
| RP-9 (g) | + | B cell (lymphoblastoid) | ALV |
| MSB-1 (h) | + | T cell (spleen tumor cells in vitro) | MDV |
| MQ-NCSU (i) | + | macrophage (spleen phagocyte) | MDV |
| QT6 (j) | + | quail fibroblast (fibrosarcoma) | MC |
| QT35 (j) | + | quail fibroblast (fibrosarcoma) | MC |
| LMH and LMH/2A (k) | + | hepatocyte (hepatocellular carcinoma) | DEN |
| 293 cells | + | human embryonic kidney | HAdV-5 |
| Transformed in vivo | | | |
| spleen | + | tumor cells (l) | MDV (RB1B) (m) |
| thymus | + | tumor cells (l) | MDV (RB1B) (m) |

(a) All of the above are chicken cells except where indicated. 
(b) ATCC = American Type Culture Collection; CEF = chicken embryo fibroblast; PDn = population doubling; Pn = passage number. 
(c) MDV = Marek's disease virus; ALV = Avian leukosis virus; MC = Methylcholanthrene; DEN = Diethylnitrosamine; HAdV-5 = Human adenovirus type 5. 
(d) Non-transformed at culture initiation; earlier passages were telomerase negative. 
(e) Nazerian et al. (1982); Nazerian (1987). 
(f) Baba et al. (1985). 
(g) Okazaki et al. (1980). 
(h) Akiyama and Kato (1974). 
(i) Qureshi et al. (1990). 
(j) Moscovici et al. (1977). 
(k) Kawaguchi et al. (1987). 
(l) Personal communication from Dr. Carol Cardona of the UC Davis School of Veterinary Medicine. 
(m) Schat et al. (1982). 

Fig. 2. Telomere dynamics, telomerase profiles and growth data for a 
representative CEF culture. (A) Growth curve for CEF3-4 culture showing 
cumulative increase in population doubling over the lifetime of the culture 
and plateau as population doublings decrease (see Materials and methods). 
(B) Telomere shortening profile illustrates division-dependent telomere 
shortening (utilizing mean TRF values) from PD5.5 to PD26 and loss of 
telomeric DNA (percent telomeric DNA derived from the ratio of the values 
for PDn to PD26) for the same timepoints. (C) Terminal restriction 
fragment (TRF) blot for PD5.5 to PD26 with 50 ng of HaeIII-digested genomic 
DNA in each lane. (D) TRAP assays for CEF3-4 from PD8 to PD26 exhibit 
lack of telomerase activity except at PD8 (lane 3) where there is a faint 50 bp 
band visible. Lane 1: 10 bp DNA ladder, Lane 2: blank, Lanes 3-9: PD8 to 
PD26, Lanes 10 and 11: human 293 cells (positive control), Lane 12: CHAPS 
buffer with no cell extract (negative control), Lane 13: chicken neurula (positive 
control). 


evident) at PD4 and PD8, respectively, the earliest timepoints 
for which protein extract samples were prepared for these cultures 
(see Fig. 2D, lane 3). Additionally, CEF3 culture 1 displayed 
a faint 50 bp band at PD23, a point in the culture's lifespan 
at which more than 90% of the cells displayed a senescent 
phenotype as indicated by cell morphology and a positive β-galactosidase 
assay (data not shown). Notably, a single flask of 
senescent CEF2 cells which survived for over 250 days in culture 
showed telomerase activity (Fig. 1B). CEFs (PD18) which 
were obtained from ATCC exhibited no telomerase activity 
(Fig. 1A). Non-transformed adult chicken kidney fibroblasts 
and early stage chicken embryos were positive for telomerase 
(Table 1). 

Telomere length profiles and loss of telomeric DNA in chicken 
embryo fibroblasts. The TRF Southern blots exhibited the 
expected smeared hybridization signal consisting of a series of 
overlapping TRF fragments ranging from about 8 to 23 kb, plus 
a broad band at 25-35 kb (which was excluded from this analysis). 
These TRF fragments represent the chicken Class II terminal 
telomere arrays previously determined to display division-dependent 
shortening (Delany et al., 2000; Taylor and Delany, 
2000). Telomere length profiles derived from TRF smears of 
cultures 1 to 6 (CEF3) were unexpectedly variable, with mean 
telomere length increasing and decreasing throughout the lifespan 
of these cultures (see Figs. 2B and C and 3A and B). In five 
cultures, mean telomere length from early passages to senescence 
showed a net decrease ranging in size from 621 to 
2,191 bp. However, as shown in Fig. 3 and Table 2, a net 
increase in mean telomere length of 564 bp for CEF3 culture 3 
was observed. These changes in mean telomere length correspond 
to a loss rate of 28 to 88 bp per cell division in the cultures 
with a decrease in mean TRF and a rate of increase of 25 bp per 
cell division in CEF3 culture 3. To supplement these data, a second 
measure of telomere shortening, loss of terminal telomeric 
DNA throughout the lifespan of each culture, was also examined 
by measuring integrated lane intensity over the same range 
of fragment sizes used to determine mean TRF length. Interestingly, 
in three cultures, terminal telomeric DNA exhibited 
increases ranging from 14 to 27% over the earliest PDs for 
which data were taken (data not shown). Subsequently all of the 
cultures demonstrated a striking loss of terminal telomeric 
DNA by senescence, with losses for all the cultures ranging 
from 40 to 85% (see Table 2, Figs. 2B and 3). 
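
The per-division rates quoted above follow from dividing the net change in mean TRF by the number of population doublings between the two timepoints. A minimal sketch of that arithmetic, using the CEF3-2 and CEF3-3 values reported in Table 2 (the function name is hypothetical):

```python
def trf_change_per_pd(early_bp, late_bp, early_pd, late_pd):
    """Net change in mean TRF length (bp) per population doubling."""
    doublings = late_pd - early_pd
    if doublings <= 0:
        raise ValueError("late_pd must be greater than early_pd")
    return (late_bp - early_bp) / doublings


# CEF3-2: 13,934 bp at PD3 and 12,840 bp at PD30 (Table 2)
print(round(trf_change_per_pd(13934, 12840, 3, 30), 1))   # -40.5

# CEF3-3: 10,468 bp at PD4 and 11,032 bp at PD27 (Table 2)
print(round(trf_change_per_pd(10468, 11032, 4, 27), 1))   # 24.5
```

A negative value indicates net shortening and a positive value net lengthening over the interval; the calculation says nothing about fluctuations between the two endpoints.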

Transformed cells 

Telomerase activity in transformed avian cells and cell lines. 
As indicated in Table 1 and Fig. 1C, eleven transformed cell 
lines from three avian species (chicken, turkey and quail) representing 
five cell types including B and T cells, macrophages, 
hepatocytes and transformed fibroblasts, showed telomerase 
activity. 

Telomere length profiles in transformed avian cell lines. Notably, 
TRF profiles of MQ-NCSU cells, ATCC DT40 cells, 
RP-9 cells and LMH cells showed very little of the typical TRF 
smear of overlapping terminal restriction fragments (Fig. 4). In 
some cases, the smear was nearly indiscernible by eye and only 
detectable by densitometry. In fact the profiles of transformed 

Fig. 3. Telomere dynamics of CEF cultures 1-6. (A) Mean terminal 
restriction fragment (TRF) length was measured in six CEF cultures from 
early passages to senescence. TRF analysis showed an increase in lower 
molecular weight fragments in five of the six cultures by senescence, an indication 
of telomere shortening. One of the six cultures showed an increase in 
mean TRF (CEF3-3). (B) In a second measure of telomere shortening, telomeric 
DNA loss (Percent of Telomeric DNA = PDinitial : PDn), all six cultures 
showed dynamic shifts culminating in eventual reductions in telomeric 
DNA, including CEF3-3. 


Fig. 4. TRF profiles of transformed cells resemble those of senescent 
CEFs lacking the typical smear of overlapping TRF fragments. Representative 
TRF blots of samples from CEF3-4 cell culture and transformed cell 
lines. By senescence, much of the telomeric DNA smear in the CEF culture 
has disappeared. Interestingly, TRF blots of the transformed cell lines also 
show little or no TRF smear, indicating that, relative to early passage CEFs, 
these transformed cells may possess a reduced amount of terminal telomeric 
DNA. 


Table 2. Telomere shortening and loss of telomeric DNA in six primary CEF cultures. Five of the six cultures showed an overall 
decrease in mean TRF length. The decrease in mean TRF for these five cultures ranged from 28 to 88 bp per cell division. The total 
telomeric DNA losses for all six cultures ranged from 40% to 85%. 

| Culture | Mean TRF length (bp), early | Mean TRF length (bp), senescent | Change in mean TRF length (bp) | Mean TRF/population doubling (bp) (a) | Loss of telomeric DNA (b) |
|---|---|---|---|---|---|
| CEF3-1 | 13,832 (PD1) | 13,211 (PD23) | -621 (PD1 : PD23) | -28.0 | 69% (PD1 : PD23) |
| CEF3-2 | 13,934 (PD3) | 12,840 (PD30) | -1094 (PD3 : PD30) | -40.5 | 69% (PD3 : PD30) |
| CEF3-3 | 10,468 (PD4) | 11,032 (PD27) | +564 (PDg.5 : PD27) | +24.5 | 54% (PD4 : PD27) |
| CEF3-4 | 16,999 (PD8) | 15,205 (PD26) | -1794 (PD, 0 : PD26) | -87.5 | 85% (PD5.5 : PD24) |
| CEF3-5 | 17,097 (PD0.4) | 15,655 (PD24) | -1442 (PDq.5 : PD24) | -61.0 | 63% (PD,m : PD24) |
| CEF3-6 | 13,569 (PD1.5) | 11,378 (PD26) | -2191 (PD1.5 : PD26) | -88.0 | 40% (PD1 : PD26) |

(a) Based on TRF smear analysis. 
(b) Based on total net lane intensities. Integrated signals from each lane were expressed as a percentage of the signal from early passage DNA as 
previously described (Harley et al., 1990). 


Cytogenet Genome Res 102:318-325 (2003) 

cells resembled the profiles of senescent CEFs in this respect 
(see Fig. 4). Mean TRF lengths for the four transformed cell 
lines were longer than the mean TRF lengths of CEF3 cultures 
1 to 6 at senescence (Table 3). 

Discussion 

The present study examined telomere length dynamics of 
Class II avian telomere arrays and telomerase activity in avian 
cells and cell lines, illustrating variability in telomere length 
patterns and documenting that a number of transformed avian 
cells possess telomerase activity. In contrast, primary CEFs 
exhibit little or no telomerase activity with rare exceptions, 
including early or late passages of normal CEFs and one flask of 
senescent CEF2 cells that survived over 250 days in culture. An 
earlier study (Venkatesan and Price, 1998) provided evidence 
for down-regulation of telomerase in vitro. Telomerase activity at an early 
point in the lifespan of a culture would be consistent with the 
presence of a small number of cells from a telomerase-positive 
population that had yet to be supplanted by the dominant 
telomerase-negative fibroblast cells. Theoretically, a small number 
of cells can provide a positive TRAP assay result (Kim et 
al., 1994). Alternatively, down-regulation of telomerase activity 
in CEFs in culture might occur because telomerase activity 
could require factors not present in vitro. Telomerase activity in 
a senescent culture would be consistent with the establishment 
of a post-crisis population of cells exhibiting a dysregulated 
telomerase expression profile, perhaps as a precursor to transformation. 

Six CEF3 cultures, derived from embryos of a highly inbred 
line and possessing little or no telomerase activity, exhibited 
variable telomere profiles including lengthening and shortening 
throughout their lifespan (Table 1 and Fig. 2). Most notably, all 
six cultures exhibited dramatic and potentially catastrophic 
loss of telomeric DNA by senescence (Table 2 and Fig. 2). The 
TRF blots for CEF3 cultures 1-6 reflected this loss of telomeric 
DNA in a diminished TRF smear. Interestingly, the TRF blots 
of transformed cells, all of which exhibited telomerase activity, 
exhibited longer mean TRF lengths than CEF3 cultures 1 to 6. 
However, the TRF blots of transformed cells also showed little 
or no TRF smear, which may be evidence of pre-transformation 
telomere erosion. 


Table 3. Mean TRF lengths for selected transformed cell lines and senescent CEFs 

| Cell type | Mean TRF |
|---|---|
| RP-9 | 18,235 |
| MQ-NCSU | 16,530 |
| DT40 | 18,731 |
| LMH | 18,945 |
| CEF3-1 (PD23) | 11,945 |
| CEF3-2 (PD30) | 12,840 |
| CEF3-3 (PD27) | 11,032 |
| CEF3-4 (PD26) | 15,205 |
| CEF3-5 (PD24) | 15,655 |
| CEF3-6 (PD26) | 11,378 |


The end-replication problem, wherein telomere shortening 
occurs passively as a result of incomplete replication of the parental 
DNA strands, explains only relatively small losses of telomeric 
DNA over the lifespan of a primary cell culture. Mean 
TRF length profiles provided evidence for losses of telomeric 
DNA in the range of 28 to 88 bp per population doubling, a 
relatively low rate of telomere attrition. However, by another 
measure, percent telomeric DNA, losses were dramatic. These 
losses of telomeric DNA, ranging from 40 to 85%, suggest that a 
mechanism other than incomplete end-replication is operating. 
Such a dramatic erosion of telomeric DNA may precede chromosomal 
end fusions (Chan et al., 2003). Catastrophic telomere 
erosion is an early event in DNA damage-induced apoptosis 
that may be produced by the release of reactive oxygen 
species due to loss of mitochondrial membrane potential (Ramirez 
et al., 2003) or by loss of the protective effect that accessory 
proteins or telomerase afford telomeres (Chan et al., 2003). 

Apparent increases in telomere length, as measured by 
mean TRF values, can be explained either by shifts in clonal 
populations or by critically short telomeres prompting a recombination 
pathway leading to telomere elongation which in turn 
allows a few cells to regain proliferative potential and reestablish 
the culture (Ijpma and Greider, 2003 and references therein). 
Cells utilizing the mechanism referred to as alternative 
lengthening of telomeres or ALT (Reddel, 2003) may invoke 
recombination-mediated lengthening of critically shortened 
telomeres by strand invasion: annealing of a DNA strand from 
one telomere to the complementary strand of another telomere 
which acts as a template for synthesis of new telomere repeats 
(Dunham et al., 2000; Varley et al., 2002). A hallmark of tumorigenesis 
is either the persistence of telomerase or an ALT 
mechanism that enables tumor cells to evade DNA-damage 
pathways. Persistence of telomerase or induction of an ALT 
mechanism in immortalized cells may prevent apoptosis by stabilizing 
telomeres. The existence of an ALT mechanism in non-transformed 
CEFs is purely speculative as no normal cells with 
such a mechanism have been detected (Reddel, 2003). 

The inconsistency between the two measures of telomere 
shortening used in this study, mean TRF and percent telomeric 
DNA, may be explained in the context of emerging and declining 
cell lineages with varying TRF distributions. For example, 
two cell populations, one with TRF values falling in a normal 
distribution over a broad range and the second with a skewed 
TRF distribution falling over a narrow range, can produce the 
same mean TRF. Thus the elimination of clonal populations 
(signaled by critically short telomeres) from the replicating pool 
of cells, followed by a second clonal population with different 
parameters assuming dominance within the culture, could 
eventually shift the mean TRF length upward or downward. 
This shift, however, might be accompanied by an overall loss of 
telomeric DNA as a growing proportion of cells in the culture 
reach a telomeric crisis. The inconsistency between different 
measures of telomere shortening suggests that relying only 
upon mean TRF when analyzing telomere dynamics can produce 
misleading results. 
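
The point that mean TRF can stay constant while total telomeric DNA collapses can be illustrated with a toy calculation. All intensities and fragment sizes below are hypothetical, and the narrow distribution is kept symmetric for simplicity rather than skewed as in the example in the text.

```python
def mean_trf(od, lengths):
    """Signal-weighted mean fragment length: sum(OD_i * L_i) / sum(OD_i)."""
    return sum(o * l for o, l in zip(od, lengths)) / sum(od)


def percent_signal(od, od_reference):
    """Integrated signal as a percentage of a reference lane."""
    return 100.0 * sum(od) / sum(od_reference)


lengths_kb = [8, 11, 14, 17, 20, 23]

# Hypothetical early-passage lane: broad smear with strong signal.
early = [10, 20, 30, 30, 20, 10]

# Hypothetical late-passage lane: weak signal confined to a narrow range.
late = [0, 0, 6, 6, 0, 0]

print(mean_trf(early, lengths_kb))   # 15.5
print(mean_trf(late, lengths_kb))    # 15.5
print(percent_signal(late, early))   # 10.0
```

In this made-up case the mean TRF is identical in both lanes while 90% of the integrated signal has been lost, which is why the two measures are reported side by side.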

Based upon our results, it is proposed that two distinct 
modes of telomere shortening were observed in the six CEF3 
cultures. The first mode, telomere attrition due to the end-replication 
problem, induced cycles of telomere shortening with the 
demise of one lineage followed by the emergence of a new dominant 
lineage. This cycling of lineages within a culture could 
produce waves of lengthening and shortening. When all of the 
lineages of cells making up a culture had achieved critically 
short telomeres, senescence occurred followed by crisis. Crisis 
was accompanied by massive telomere loss due to a second 
mode of telomere shortening, telomere erosion, perhaps induced 
by oxidative DNA damage or down-regulation of telomere-associated 
proteins, followed ultimately by the demise of 
the culture. The lack of an intense Class II TRF smear in the 
blots of telomerase-positive, transformed cells examined in this 
study suggests that, due to the induction of a telomere-stabilizing 
mechanism such as the up-regulation of telomerase, these 
lineages were able to survive and proliferate despite loss of the 
normal chicken telomere length profile. 

An alternative explanation for both the cycling in TRF length 
and the dramatic loss in telomeric DNA near the end of 
the lifespan of the CEF3 cultures might be a pattern of 
“breakage-fusion-ring-bridge-breakage” caused by end-fusion of chromosomes 
with short telomeres (McClintock, 1939). The shifting 
of terminal telomeric sequences to an interstitial location 
due to end fusions followed by breakage of the fused chromosomes 
could produce the pattern of shortening and lengthening 
detected by changes in TRF smears for the CEF3 cultures. 

Rodents have long been used as model organisms for the 
study of human aging. However, the murine model may not be 
optimal for studies of human replicative senescence for a number 
of reasons. Wild-type rodent somatic cells can retain telomerase 
activity and do not appear to display division-dependent 
telomere shortening (Prowse and Greider, 1995; Forsyth et al., 
2002; Kim et al., 2002). Both human and chicken somatic cells 
lack telomerase, with down-regulation of telomerase activity 
occurring early in development (Forsyth et al., 2002). Recent 
studies have demonstrated division-dependent telomere shortening 
in chicken chromosomes both in vivo and in vitro and the 
ontogenetic down-regulation of telomerase in most chicken 
somatic tissues in vivo (Delany et al., 2000; Taylor and Delany, 
2000). Chicken and human primary fibroblast cells are generally 
refractory to spontaneous immortalization, in contrast to 
mouse fibroblasts (Lima and Macieira-Coelho, 1972; Lima et 
al., 1972; Macieira-Coelho and Azzarone, 1988; Prowse and 
Greider, 1995). Also, there is a fundamental difference between 
human and mouse telomere damage signaling mechanisms 
(Smogorzewska and de Lange, 2002). Many of the known 
telomere proteins, including TRF1, TRF2, Pot1, RAP1 and 
tankyrase, are highly conserved between chicken and human 
(Konrad et al., 1999; Price, 2001; Wei et al., 2002 and references 
therein). 

The results of this study further establish the similarities 
between human and avian telomere biology by demonstrating 
that telomere dynamics in non-transformed and transformed 
chicken cells are consistent with what has been observed in 
human cells. The significance of these similarities is not diminished 
by the existence of megabase telomere arrays in chicken 
as it appears to be the shortest telomere, not average telomere 
length, that is critical for cell viability and chromosome stability 
(Hemann et al., 2001a). It is foreseeable, therefore, that the 
chicken may prove to be an extremely important model for 
studies of human cellular senescence and cellular transformation. 
Further research utilizing molecular and cytogenetic techniques, 
protein expression profiles and apoptosis assays will 
provide valuable insight into telomere-mediated pathways to 
senescence and oncogenesis in chickens and other birds. 

Abstract. Chromosome-specific paints from macrochromosomes 
1-9 and Z of the chicken were hybridised to metaphases 
of the red-legged partridge and revealed no inter-chromosomal 
rearrangements. The results from chromosome painting are 
similar to previous studies on the Japanese quail but different 
from findings in guinea fowl and several species of pheasant. 
The difference in centromere position in chicken and partridge 
chromosome 4, previously assumed to be the result of an inversion, 
was confirmed. However, FISH mapping of BAC clones 
from chicken chromosome 4 revealed that the order of loci was 
the same in both species, indicating the occurrence of a neocentromere 
during divergence. 

Copyright © 2003 S. Karger AG, Basel 


The red-legged partridge (Alectoris rufa, ARU) belongs to 
the order Galliformes, family Phasianidae. It is found mostly in 
Mediterranean areas (Spain, France, Portugal), but also in parts 
of northern Italy and southern England. The partridge is popular 
as small game and many hunters consider it to be one of the 
most prized species in Spain. Besides hunting, it also has an 
important ecological value as it represents the characteristic 
avifauna of southern Europe. The number of wild red-legged 
partridges has decreased for a number of reasons, including 
excessive hunting, high predation, inadequate reintroductions 
with captive partridges from farms (some crossed with other 
partridge species), destruction of its natural habitat and persistent 
drought. 

The genetic map of the chicken (Gallus gallus domesticus, 
GGA) is well advanced since it is a model species suitable for 
developmental studies and has a conserved genome of small 
size, 1,200 Mb, approximately 40% of the human genome 
(Burt and Cheng, 1998; Griffin et al., 1999). Comparative studies 
based on chromosome painting using chicken probes have 
been applied to emu (Shetty et al., 1999; Grutzner et al., 2001), 
great grey owl (Schmid et al., 2000), pheasants (Schmid et al., 
2000; Grutzner et al., 2001), quail, American rhea (Grutzner et 
al., 2001), California condor (Raudsepp et al., 2002) and guinea 
fowl (Shibusawa et al., 2002). 

However, only a few studies have been performed on the 
molecular genetics and cytogenetics of this species of partridge 
which, like the chicken, has a 2n = 78 karyotype (Arruga et al., 
1996). It was noticed that while chromosomes 4 and 8 are submetacentric 
and metacentric respectively in chicken, both are 
acrocentric in the partridge. Between chicken and the partridge, 
chromosomal homologies were studied only on chromosomes 
1, 4 and Z by chromosome painting (Dias et al., 1995; 
Ramos et al., 1999). As the GGA4-specific paint probe hybridised 
throughout the entire ARU4, Ramos et al. (1999) concluded 
that the difference in the centromere position was most likely 
due to an inversion. 

Fig. 1. Chromosome painting of chicken 
probes on partridge chromosomes. The size of 
partridge Z chromosome (green) is the closest to 
chromosome 4 as shown by two-colour FISH. 



In this paper we have applied chromosome-specific paint 
probes derived from GGA1-9 and Z to the partridge and have 
used specific BAC clones from GGA4 to characterize the chromosome 
4 rearrangement between chicken and the partridge. 
Our results indicate that the order of loci in chromosome 4 in 
both species is the same and suggest that the change in the centromere 
position has resulted from the occurrence of a neocentromere 
during divergence. 

Materials and methods 

Chromosome preparation 

Chromosome metaphases were prepared from blood cells by standard 
methods with several modifications. Peripheral blood was obtained from a 
male red-legged partridge. The blood was dispensed into 1-ml sterile vials 
containing 15 USP units of sodium heparin. Blood cultures were prepared in 
RPMI 1640 medium supplemented with 20% fetal calf serum, 1% L-glutamine, 
1% pokeweed mitogen and 1% antibiotic-antimycotic and incubated 
at 38 °C. The cultures were harvested after 72 h. Colcemid was added to a 
final concentration of 0.05 mg/ml during the last 1 h of incubation. 

DNA probes 

Chromosome-specific DNA was made from DOP-PCR amplified chicken 
macrochromosomes 1-9 and Z sorted using a dual laser flow cytometer as 
previously described (Griffin et al., 1999). Six chicken BAC clones, B39-K10, 
R23-N15, R36-K22, R40-P18, R46-J4 and R47-L5, were selected from the 
chicken BAC database (Lee et al., 2003; http://hbz.tamu.edu) and obtained 
from the Texas A&M University in the USA. Each clone has been assigned to 
a chicken gene or marker: EDNRA (109 cM), MCW0114 (82 cM), ADL0143 
(3 cM), ADL0255 (0 cM), ADL0317 (12 cM) and ADL0145 (66-82 cM), 
respectively (Groenen et al., 2000; Suchyta et al., 2001; 
http://poultry.mph.msu.edu/resources/Resources.htm). The culture and extraction followed a 
standard protocol. 
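
The comparison of locus order that these clones make possible can be sketched as follows. The clone names and map positions are those listed above; the function names, the use of the midpoint for ADL0145 (given as 66-82 cM) and the idea of classifying an observed order as collinear, a single inversion or something more complex are assumptions of this sketch, not the procedure used in the study.

```python
# Clone -> (marker, genetic map position on GGA4 in cM), as listed in the text.
# ADL0145 is given as a range (66-82 cM); its midpoint is used only for sorting.
CLONES = {
    "B39-K10": ("EDNRA", 109.0),
    "R23-N15": ("MCW0114", 82.0),
    "R36-K22": ("ADL0143", 3.0),
    "R40-P18": ("ADL0255", 0.0),
    "R46-J4": ("ADL0317", 12.0),
    "R47-L5": ("ADL0145", 74.0),
}


def reference_order(clones):
    """Clone names sorted by their map position on the reference chromosome."""
    return [name for name, _ in sorted(clones.items(), key=lambda kv: kv[1][1])]


def classify_order(reference, observed):
    """Compare an observed clone order with the reference order.

    Returns "collinear" if identical, "single inversion" if reversing one
    contiguous block of the reference reproduces the observed order, and
    "complex" otherwise.
    """
    if observed == reference:
        return "collinear"
    n = len(reference)
    for start in range(n):
        for end in range(start + 2, n + 1):
            candidate = (reference[:start]
                         + reference[start:end][::-1]
                         + reference[end:])
            if candidate == observed:
                return "single inversion"
    return "complex"


if __name__ == "__main__":
    chicken = reference_order(CLONES)
    print(chicken)
    # Hypothetical observed order in a second species, read in the same direction.
    print(classify_order(chicken, list(chicken)))
```

An inversion spanning the centromere would be expected to reverse the order of at least some of the clones, so an identical order argues against a simple inversion.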

Fluorescence in situ hybridisation 

Hybridisation of probes and their detection followed the method of Yang 
et al. (1995) with some modification. Briefly, the metaphase preparations 
were denatured in 70% formamide, 2x SSC at 63 °C for 20 s and dehydrated 
through an ethanol series. Each chromosome-specific probe was labeled with 
biotin-16-dUTP or digoxigenin-11-dUTP (Roche), and hybridised in turn at 
37 °C overnight to separate metaphase preparations. The slides were washed 
in 50% formamide, 2x SSC at 43 °C for 10 min and then the hybridisation 
signals were detected by Cy3-avidin (Amersham) for biotin probes or mouse 
anti-digoxigenin (Roche) followed by FITC rabbit anti-mouse antibodies for 
digoxigenin probes. The slides were counterstained with DAPI, mounted in 
Vectashield (Vector) and examined by digital fluorescence microscopy. 


Results 

Chromosome-specific paint probes of GGA1-9 and Z each 
hybridised to only one pair of macrochromosomes in the partridge 
without hybridisation to microchromosomes (Fig. 1). It 
appeared that the identity of each partridge chromosome pair 
corresponded to a pair with the identical number in chicken, 
although chromosome 4 is submetacentric and chromosome 8 
is metacentric in chicken, and both are acrocentric in the partridge. 

Fig. 3. Evolution of an ancestral common chromosome of chicken and 
partridge 4 formed by the fusion of an acrocentric chromosome and a 
microchromosome. The orientation of BAC clones towards the long arm telomere 
is not inverted but has the same orientation between chicken and partridge. 
Although the position of the centromere is different, it seems that the 
genomic orientation is conserved through this rearrangement. The letters 
indicate the BAC clones shown in Fig. 2. 


Five BAC clones from the short arm of GGA4 hybridised to 
the long arm of ARU4 (Figs. 2, 3). Three clones, R36-K22, 
R40-P18 and R46-J4, hybridised close together at the distal 
end of the short arm of GGA4, and the long arm of ARU4 proximal 
to the centromere (Fig. 3a, b, c). R47-L5 located in the 
proximal short arm of GGA4 hybridised to the long arm of 
ARU4 (Fig. 3d). R23-N15 was located very close to the short 
arm centromeric region in GGA4 (Fig. 3e). B39-K10 was found 
at almost the same distance from the distal end of the long arm 
in both species (Fig. 3f). R23-N15 also showed a weak signal in 
both species at the same sites: the centromeric region of chromosomes 
1 and 8, and one of the telomeric regions in GGA8 and 
the telomeric region of ARU8q. 


Fig. 2. Localization of chicken BAC clones to chicken (left) and partridge 
(right) chromosomes 4. (a) R36-K22, (b) R40-P18, (c) R46-J4, 
(d) R47-L5, (e) R23-N15 and (f) B39-K10. 

Discussion 

Comparative studies using chromosome painting have been 
undertaken recently among avian species (Shetty et al., 1999; 
Schmid et al., 2000), but so far only about 13 species have been 
compared and all these comparisons have been with chicken, 
and specifically with GGA1-9 and Z (Schmid et al., 2000). 
These studies are as yet limited in that in only seven species 
were most probes employed successfully. Only the Japanese 
quail (Coturnix coturnix japonica, 2n = 78) showed apparently 
complete homology with the macrochromosomes of the chicken 
(Shibusawa et al., 2001). Similar results were obtained in the 
emu (Shetty et al., 1999) with the exception that GGA4 paint 
also hybridised to one microchromosome. The same observation 
was made in the great grey owl and three gallopheasants 
(Phasianus colchicus, 2n = 82; Chrysolophus pictus, 2n = 82; 
Lophura nycthemera, 2n = 80; Schmid et al., 2000); in addition, 
GGA2 hybridised to two separate chromosomes numbered 3 
and 6 (Schmid et al., 2000). GGA4 corresponds to two macrochromosomes 
also in the California condor (Raudsepp et 
al., 2002). In the guinea fowl, GGA4 is homologous to the 
whole long arm of chromosome 4 and a microchromosome 
(Shibusawa et al., 2002). In the present study, our results with 
the partridge are similar to those previously observed in the 
Japanese quail and indicate similar close conservation with the 
chicken. 

Homologies of GGA4 to human chromosomes are separated 
at the centromere, and GGA4p and GGA4q are homologous 
to parts of human chromosome X and 4 respectively 
(Groenen et al., 2000). Thus GGA4 is suggested to have arisen 
from a fusion of two ancestral bird chromosomes (Schmid et 
al., 2000). Hybridisation of human chromosome 4 paint to 
GGA4q indicates high conservation of this region over 300 
million years (Chowdhary and Raudsepp, 2000). Although the 
California condor chromosome 4 is submetacentric, it is 
believed to share homology with chromosome 4 in emu, pheasants, 
great grey owl and human (Raudsepp et al., 2002). Six 
cosmid clones mapped on GGA4p were localised on chromosome 
4q of Japanese quail and on a microchromosome in guinea 
fowl (Shibusawa et al., 2001, 2002). A microchromosome 
with a conserved region of human chromosome Xq is postulated 
to have fused to the ancestral acrocentric chromosome 4 
in chicken, old world quails and peafowl after these species diverged 
from the lineage with the ancestral karyotype (Shibusawa 
et al., 2002). The finding that the GGA4 paint probe maps to 
both chromosome 4 and a microchromosome suggests that 
ARU4 also arose by the fusion of an acrocentric chromosome 
and a microchromosome (Fig. 3). 

By comparison of the loci of chicken BAC clones in the two 
species, we investigated the morphological difference between 
GGA4 and ARU4. The orientation of the six chicken BAC 
clones was the same in the partridge from 4cen to 4qtel. A 
similar orientation has been observed between chromosome 4 
in the chicken and Japanese quail (Shibusawa et al., 2001), but 
was interpreted as the consequence of more than two rearrangements. 
A centromeric clone on the GGA4 short arm, R23-N15, 
seems to include a centromere-specific sequence shared with 
chromosomes 1 and 8. It hybridised to two sites in chromosome 
8, suggesting that this is a result of a centromere-telomere 
inversion, or simply duplication of a family of sequences at several 
sites. 

The study by Ramos et al. (1999) suggested that the
morphological differences between GGA4 and ARU4 could be
caused by a para- or pericentric inversion in partridge; however,
our results show that this homology could not be explained by a
simple inversion. Assuming that this fusion represents a recent
evolutionary event and the orientation of loci on chromosome
4 is the same in both species (Fig. 3), the observed difference in
chromosome 4 centromere position can be explained only by
the formation of a neocentromere during divergence. It is also
possible that GGA4 and ARU4 arose independently by the
fusion of the same acrocentric chromosome and the same
microchromosome, different only in centromere position.
There is precedent for such centromere repositioning in the
karyotype evolution of chromosomes corresponding to human
chromosomes 4, 9, 10 and X in primates (Eder et al., 2003;
Montefalcone et al., 1999; Carbone et al., 2002; Ventura et al.,
2001). Although neocentromere formation has been reported
only in mammalian chromosomes so far, the same mechanism
could be expected to occur in other species and, in the present
case, during evolution of either GGA4 or ARU4. Alphoid
sequences have been observed to persist through evolution at the
sites of inactivated centromeres, for example at human
chromosome 2q21 (Baldini et al., 1993), and so molecular studies
may help to confirm or refute the occurrence of a
neocentromere as postulated here.

Abstract. Efforts to build a comprehensive genetic linkage
map for the turkey (Meleagris gallopavo) have focused on
development of genetic markers and experimental resource
families. In this study, PCR amplification was attempted for
772 microsatellite markers that had been previously developed
for three avian species (chicken, quail and turkey). Allelic
polymorphism at 410 markers (53.1% of total examined) was
determined by genotyping ten individuals (six F1 parents and four
grandparents) in a new resource population specifically
developed for genetic linkage mapping. Of these 410 markers, 109
(26.6%) were polymorphic in the tested individuals, with an
average of 2.3 alleles per marker. Higher levels of
polymorphism were found for the turkey-specific markers (61.1%) than
for the chicken (22.7%) or quail-specific markers (33.3%). To
test the fidelity of the matings, demonstrate the power of these
families for linkage analysis, and determine genetic linkage
relationships, 86 polymorphic markers were genotyped for up
to 224 birds including founder grandparents, parents and F2
progeny. Linkage relationships for many of the chicken
markers elucidated in the turkey were comparable to those observed
in the chicken. These data demonstrate that the new UMN/
NTBF resource population will provide a solid foundation for
constructing a comparative genetic map of the turkey.

Copyright©2003 S. Karger AG, Basel 


Microsatellite markers that display cross-species
amplification and polymorphism have greatly facilitated comparative
genomics of closely related species. For example, within
artiodactyls, cross-species amplification is seen for roughly 70% of
the microsatellites among sheep, cattle, and goat (Reed and
Beattie, 2001), and extensive conservation is present in both
marker order and spacing between sheep and cattle (Crawford
et al., 1995; de Gortari et al., 1997). Perhaps the most extreme
examples are cattle microsatellites that were not assignable in
cattle but were mapped in ovine reference populations (Freking
et al., 1998).

Studies of avian microsatellite markers have reported
mixed results from cross-species applications (Levin et al.,
1995; Liu et al., 1996; Hanotte et al., 1997; Primmer et al.,
1997; Reed et al., 1999, 2000a). Amplification across species
generally exceeded 50% of the markers tested, but levels of
heterozygosity varied due in part to the different test animals
used for screening. An expanded study of 520 chicken markers
(Reed et al., 2000a) achieved a 54% amplification rate in the
turkey, and encouraging levels of polymorphism (~35%) were
observed for the 57 markers genotyped. Results from that study
indicated that chicken microsatellites could be utilized in
mapping the turkey genome and, more importantly, would provide
anchor points for comparative mapping.

The foundation of any map building effort or mapping
program is the statistical power afforded by the reference
population. Reference families should be constructed to facilitate both
linkage detection and locus ordering with maximum efficiency.
Exotic crosses may increase the power of the reference families
due to increased parental heterozygosity. However, marker
alleles identified in these populations may or may not be
present in commercial populations. A small mapping/reference
panel for turkeys has been developed at Brigham Young
University from matings of two wild male turkeys each to three
partially inbred Orlopp line C hens (Huang et al., 1999). The
largest family includes a total of 72 backcross progeny from a
single male. This panel was established using a rationale similar
to that of Crittenden et al. (1993) in developing the East
Lansing chicken reference panel, where divergence between
commercial and wild turkeys is expected to maximize
heterozygosity in the F1. However, the small number of offspring in this
reference population limits statistical support for linkage and
long-term utility of the mapping panel. Moreover, the
backcross design is less powerful than the F2 design for mapping
codominant markers (Da et al., 2002).

Fig. 1. Pedigree of the UMN/NTBF mapping resource population.

A second mapping/reference panel was used by Harry et al.
(2003) to create a linkage map for the turkey based on RFLP
mapping of expressed sequences (cDNAs). This panel was
derived from a single F1 sire (an intercross between two
commercial lines, A and B) backcrossed to two dams from line B.
Because this panel was used extensively for RFLP genotyping,
a method requiring significant quantities of genomic DNA,
only limited amounts of DNA remain, making this resource
inappropriate for long-term use. This backcross population has
the same statistical limitations as the BYU population.
Therefore, we have developed a new F2 reciprocally mated reference
population to generate a medium-density genetic map of the
turkey. This study examines genetic polymorphism at avian
microsatellite markers in this new population and
demonstrates its value for constructing a genetic linkage map for the
turkey.


Materials and methods 

Construction of resource population 

All phases of animal husbandry were conducted at Nicholas Turkey
Breeding Farms (NTBF), Sonoma, CA. The resource population was initiated
by crossing two divergent production lines to produce the F1 generation. One
line (designated yield, Y) produces an unusually high yield of breast meat
while the other line (designated fecund, F) produces an unusually large
number of poults. A single male and 16 females were retained as breeders from
within each of five Y × F families. The five F1 males were then each mated to
12 unrelated F1 females using a modified factorial design and F2 poults
hatched in early 2000. Hatches from the sixty F1 families produced over
2000 total F2 offspring. After completion of hatching, blood was drawn and
frozen.

Table 1. Summarized results from testing avian microsatellite markers in
turkey

Species of origin  Markers tested  Markers amplified  Monomorphic  Polymorphic
Chicken            678             356                275          81
Quail              49              18                 12           6
Turkey             45              36                 14           22
Total              772             410                301          109

The total number of F2 offspring produced in the F1 cross far exceeded
the number desired for genetic mapping. However, this design was used to
ensure redundancy against unanticipated male infertility, hen mortality, or
other problems with the expectation that only a few families would be
selected for constructing a mapping panel. Selection of the subset of F1 families to
comprise the UMN/NTBF mapping population was based on total number
of offspring per family, distribution of offspring across the F1 dams, and
relatedness of the F1 parents. Families of four dams (D1042, D1044, D1049,
and D1057) from two of the five F1 sires (D7491 and D3804) were chosen for
use in map construction (Fig. 1). These four families include 207 F2 offspring
and are related to each other through two grandsires (C6013 and C6014 in
the P generation).

DNA isolation 

DNA was extracted from whole blood using the following procedure.
Briefly, 400 µl whole blood (collected in EDTA) was incubated in 20 ml of
cell lysis buffer (100 mM NaCl, 10 mM Tris-HCl, 25 mM EDTA, 0.5% SDS,
pH 8.0) at 60 °C for 10 min. The resulting solution was extracted twice with
1:1 phenol:chloroform and the aqueous phase recovered with Phase Lock Gel
(Eppendorf). To the aqueous phase, 2.5 ml 7.5 M ammonium acetate and
25 ml isopropanol were added and the DNA was spooled, washed with 75%
ethanol and resuspended in 10 mM Tris-HCl, 1 mM EDTA.

Table 2. Primers redesigned for several TUM markers from GenBank entries
and for two chicken markers from turkey sequence

Marker        Forward primer             Reverse primer
TUM01         CTGAAAGGGATGGAGGGAAG       GTTGTGAATGCCTCTCATGG
TUM08         CCGCCCTACTCACTCACTCA       ATAGGGGGAAGGGGTGTG
TUM09         GTCACATTGCCTCTGGGTTT       CCCGCCAGTGTGCTCTAA
TUM11         TCCAAACTGTGTTTCCAACTTC     GTGTGGCAGTCTGAATGAGG
TUM14         CTCGGGTGCAGATAGTTTCC       TCCTGTCCCCTCTTCCTCTC
TUM15         TGCTTGTAACATGTCATCCTG      GCAAAATCATTCAGCAGCAG
TUM16         GACCCACCAAGTGTTTTCTTC      CCATCCTCACCTTGTTTTGG
TUM17         TTAAATGGTGTTTGGGGAAAG      TAATTGGTTGGGGGAGGAAC
TUM18         ACTGAGAAAATGCCGCTCAG       CCTTCTTTTATTCATCCAGATCTTA
TUM20         CACATTCATTCTTTCTTCTGTGC    TGAGCAGTTCCTGTGTAGGAC
TUM25         AAATCATGTATTTGCCATACGC     TGTCTAGATGAGCAGCTGAATG
TUM26         GACCCACCAAGTGTTTTCTTC      CCTCACATTGTTTTGGTTTGG
TUM28         CCAAAGGAAGGAGGAACC         CCAACCATGGTGTGAGGAG
TUM32         CAACTTCGCTCCAGCACAC        GGGAAACGGAAACATGAAAG
TUM36         CGCCCTCCGCACAC             TGAACTGGCTGGAGTTGTTG
TUM48         TGAATAGCATGGGGAATCAG       GCCATTCTGTGATTTTGTGC
TUM49         CATCTCCCCCTTCCCATC         GCAAAGGATTATGGGTGTCC
(MNT)ADL0315  TCGAGATCSTCCATGAATACG      CCTTGTTAAGTGGGTGAGAATG
(MNT)LEI0070  GGAATTCACGTGTGGTGCTC       ATGCTCTCGTGTGAGGTGTG
MCW0034       TGCACGCACTTACATACTTAGAGA   TCCTGGGTTAACTGCTGAAAG


Screening of microsatellite markers 

A total of 772 chicken, quail, and turkey microsatellite primer pairs
(Table 1) was tested for amplification on pooled DNA from four turkeys (NTBF
commercial line). Primers included 678 from the chicken (Comprehensive
Mapping Kits 1-7 courtesy of Jerry Dodgson, US Poultry Genome
Coordinator, ADL, LEI, MCW and ROS, and several obtained from F.A. Ponce de
Leon, UMA), 49 developed for the Japanese quail (Coturnix japonica,
Kayang et al., 2000; M0001 and GUJ [except GUJ0007]) and 45 developed
for the turkey (Donoghue et al., 1999, RHT; Huang et al., 1999, TUM; Latch
et al., 2002, WT). To improve amplification, primers were redesigned for
several TUM markers from GenBank entries and for two chicken markers
from turkey sequence. Primer sequences for these markers are listed in
Table 2.

PCR reactions (12 µl total volume) contained 0.25 unit of Taq
polymerase (Sigma, Inc.), 0.2 mM each dATP, dCTP, dTTP, dGTP, 7.5 pmol of
each primer, and 20-25 ng template DNA. Each primer pair was used at
previously determined PCR conditions (Reed et al., 2000a) or optimized by
testing over a range of annealing temperatures (50-60 °C) and MgCl2
concentrations (1.5, 3.0 and 4.5 mM). Amplifications were performed in a
Touchgene thermal cycler (Techne). Reaction conditions included an initial
incubation of 5 min at 94 °C followed by 30 cycles of 1 min at 94 °C, 1 min at
annealing temperature, and 30 s at 72 °C with a final 5-min extension at
72 °C. PCR products were evaluated by electrophoresis in 2% agarose gel
stained with ethidium bromide and sized by comparison to a ΦX HaeIII
marker ladder.
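
The optimization described above amounts to testing each primer pair over a small grid of conditions. The sketch below enumerates such a grid. The temperature range and MgCl2 concentrations come from the text; the 2 °C step is an assumption (the text gives only the range), and the names are illustrative.

```python
from itertools import product

# Range and concentrations as stated in the text.
# The 2 degree step is an ASSUMPTION; the text gives only "50-60 °C".
ANNEALING_TEMPS_C = range(50, 61, 2)   # 50, 52, ..., 60
MGCL2_MM = (1.5, 3.0, 4.5)

def optimization_grid():
    """Return every (annealing temperature, MgCl2 concentration) pair to test."""
    return list(product(ANNEALING_TEMPS_C, MGCL2_MM))

grid = optimization_grid()
for temp_c, mgcl2_mm in grid:
    print(f"anneal {temp_c} °C, MgCl2 {mgcl2_mm} mM")
```

With a 2 °C step this gives six temperatures by three concentrations per primer pair; a coarser or finer step changes only the range definition.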

Polymorphisms at each marker were determined by genotyping ten of
the founding individuals of the resource families. These included the four
F1 females and two F1 males plus the two grandsires and two granddams
of the F1 males. DNA fragments for each marker were amplified and
labeled for electrophoresis by substituting 33P-dATP (0.3 pmol) in the PCR
reaction. PCR products were denatured at 94 °C and electrophoresed
through 5% acrylamide gels. After autoradiography, allele sizes were
determined by comparing amplified fragments to size markers (M13 sequencing
reaction).

Linkage analysis 

To test the fidelity of the matings, DNAs were extracted for all F2
individuals of the four dam families and genotypes were determined at three
chicken microsatellite markers (LEI0106, LEI0107, and MCW0036). With
one exception (an offspring from Dam 1057), the genotypes of all individuals
were consistent with Mendelian expectations given the known
parent-offspring relationships. The one exceptional individual was subsequently
removed from the population. Thus, the complete three-generation UMN/
NTBF resource population consists of 224 individuals (ten founder
grandparents, six F1 parents and 206 F2 offspring). To facilitate distribution to
other researchers (see below), a “distribution version” of the resource
population has been created by eliminating 34 F2 offspring from across all families.
This version of the mapping panel comprises two 96-well plates, and
encompasses all founder grandparents, parents and 172 F2 offspring.
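
The Mendelian consistency check described above can be illustrated with a minimal sketch: an offspring genotype is accepted only if it can be formed from one sire allele and one dam allele. The function name is illustrative, and the sketch assumes codominant alleles scored without null alleles. The example genotypes are those of the ADL0317 inheritance error reported in the Results.

```python
from itertools import product

def is_mendelian(offspring, sire, dam):
    """True if the offspring genotype can be formed from one sire allele
    and one dam allele (assumes codominant alleles and no null alleles)."""
    possible = {tuple(sorted(pair)) for pair in product(sire, dam)}
    return tuple(sorted(offspring)) in possible

# Genotypes are allele sizes in bp.
sire, dam = (152, 152), (152, 156)
print(is_mendelian((152, 156), sire, dam))  # a compatible offspring
print(is_mendelian((156, 156), sire, dam))  # the genotype reported at ADL0317
```

A marker with segregating null alleles would need a more permissive rule, since an apparent homozygote may carry an unamplified allele.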

To test the power of the UMN/NTBF resource population for linkage
analysis and determine linkage relationships, genotypes for all informative
chicken, quail and turkey microsatellite markers (Table 1) were determined
for the four individual dam families (P, F1 and F2 individuals). Linkage
distances and marker ordering were determined using Locusmap software
(Garbe and Da, 2003). The Kosambi map function was used to translate
recombination frequencies into map distances in centimorgans (cM).
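
For reference, the Kosambi map function converts a recombination fraction θ into a map distance d = 25 ln[(1 + 2θ)/(1 − 2θ)] cM. The sketch below is a standalone illustration of that formula, not the Locusmap implementation; the function name is illustrative.

```python
import math

def kosambi_cm(theta):
    """Kosambi map distance in centimorgans for a recombination fraction
    theta, which must satisfy 0 <= theta < 0.5."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))

for theta in (0.02, 0.10, 0.30):
    print(f"theta = {theta:.2f} -> {kosambi_cm(theta):.1f} cM")
```

For small θ the distance is close to 100θ cM; the correction grows as θ approaches 0.5, where the distance is unbounded.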


Results 

Locus polymorphism 

Testing of avian microsatellite markers involved first
checking each primer pair for amplification of turkey DNA,
examining the amplifiable markers for polymorphism in a test panel,
and finally genotyping informative markers on the UMN/
NTBF resource population. Of the 772 avian primer pairs
tested, 410 (53.1%) yielded analyzable PCR products including
356 chicken, 18 quail, and 36 turkey microsatellites. Other
chicken markers yielded large (>500 bp) amplified products.
However, because such larger fragments are not easily
analyzed, these markers were not further investigated. In addition,
twenty-five chicken markers that yielded multiple PCR
fragments were genotyped. Of these, 21 were monomorphic and
four produced at least one polymorphic fragment. A complete
list of all avian microsatellite markers that produced a product
in turkey and were tested for polymorphism is available from
the senior author.
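
The amplification and polymorphism rates quoted in this section follow from the marker counts in Table 1. A short sketch of the arithmetic is given below; the dictionary layout and function names are illustrative. Note that plain rounding gives 22.8% for the polymorphic chicken markers (81 of 356), where the text reports 22.7%.

```python
# Marker counts transcribed from Table 1.
TABLE_1 = {
    "chicken": {"tested": 678, "amplified": 356, "polymorphic": 81},
    "quail":   {"tested": 49,  "amplified": 18,  "polymorphic": 6},
    "turkey":  {"tested": 45,  "amplified": 36,  "polymorphic": 22},
}

def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

def summarize(table):
    """Map each species (and the total) to a pair:
    (% of tested markers amplified, % of amplified markers polymorphic)."""
    rows = {
        species: (pct(c["amplified"], c["tested"]),
                  pct(c["polymorphic"], c["amplified"]))
        for species, c in table.items()
    }
    tested = sum(c["tested"] for c in table.values())
    amplified = sum(c["amplified"] for c in table.values())
    polymorphic = sum(c["polymorphic"] for c in table.values())
    rows["total"] = (pct(amplified, tested), pct(polymorphic, amplified))
    return rows

summary = summarize(TABLE_1)
for species, (amp, poly) in summary.items():
    print(f"{species:8s} amplified {amp:5.1f}%  polymorphic {poly:5.1f}%")
```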

Of the 356 chicken markers examined, 275 were
monomorphic with a single allele present in all ten individuals
(including six displaying possible null alleles) and 81 (22.7%) were
polymorphic (including two displaying null alleles). In one case
(LEI0070), new primers were specifically designed to
circumvent primer-based null alleles. Overall, the number of alleles
per marker ranged from 1 to 5 with an average of 1.30. Of the 81
polymorphic chicken markers, alleles of 66 markers could be
unambiguously scored and were informative in the UMN/
NTBF resource population (Table 3).

Table 3. Utility of polymorphic avian microsatellite markers tested for
genetic linkage mapping on the UMN/NTBF turkey resource populationᵃ

Marker      Annealing  MgCl2  Amplicon   Resultᵇ  No.      Informative  Chicken
            temp (°C)  (mM)   size (bp)           alleles  meioses      chrmᶜ

Chicken
ADL0107     54         1.5    181-183    Pi       2        208          7
ADL0142     58         1.5    214-226    Pi       3        346          6
ADL0149     60         1.5    236-238    Pi       2        158          17
ADL0180     56         4.5    171-189    Pn       4        NA           7
ADL0184     56         1.5    130-153    Pi       2        204          18
ADL0188     56         4.5    125-127    Pi       2        42           1
ADL0254     60         1.5    89-91      Pi       2        207          28
ADL0262     60         1.5    112-121    Pn       3        NA           23
ADL0263     54         1.5    130-160    Pi       5        354          14
ADL0266     54         4.5    101-109    Pi       2        357          4
ADL0272     58         1.5    168-178    Pi       2        149          10
ADL0289     54         1.5    74-78      Pn       2        NA           23
ADL0292     60         1.5    125-137    Pi       3        262          5
ADL0306     56         1.5    112-116    Pi       2        47           3
ADL0315ᵈ    54         1.5    250-254    Pi       2        214          7
ADL0317     60         1.5    152-156    Pi       2        51           4
ADL0320     60         1.5    114-122    Pi       3        252          6
ADL0350     54         1.5    91-93      Pn       2        NA           1
ADL0353     58         1.5    154-160    Pn       2        NA           1
ADL0377     54         1.5    143-145    Pi       2        301          6
BCL2        54         1.5    189-207    Pn       2        NA           2
GCT0055     54         1.5    190-196    Pn       3        NA           12
LEI0043     56         1.5    107-110    Pi       2        50           3
LEI0064     60         1.5    308-317    Pi       2        47           7
LEI0066     60         1.5    300-328    Pi       4        357          14
LEI0070ᵈ    60         1.5    174-182    Pi       3        49           2
LEI0075     58         1.5    241-243    Pn       2        NA           Z
LEI0095     58         1.5    240-248    Pn       2        NA           4
LEI0098     58         1.5    145-149    Pi       2        158          14
LEI0103     60         1.5    247-259    Pi       2        55           10
LEI0105     58         1.5    152-156    Pi       2        363          2
LEI0106     58         1.5    296-306    Pi       3        410          1
LEI0107     58         1.5    224-226    Pi       2        252          1
LEI0117     54         1.5    192-208    Pi       3        148          2
LEI0130     58         1.5    240-246    Pi       2        183          9
LEI0132     58         1.5    228-230    Pn       2        NA           UN
LEI0138     58         1.5    183-185    Pi       2        86           1
LEI0158     58         1.5    92-94      Pi       2        251          7
LEI0169     58         1.5    228-230    Pi       2        79           1
LEI0358     56         1.5    162-180    Pi       4        125          UN
MCW0019     54         1.5    102-108    Pi       4        155          1
MCW0031     54         1.5    246-285    Pn       2        NA           15
MCW0034ᵈ    56         1.5    100-102    Pi       3        265          2
MCW0035     56         1.5    218-220    Pi       2        308          10
MCW0036     54         1.5    162-170    Pi       3        355          1
MCW0075     58         1.5    182-190    Pi       2        85           24
MCW0080     54         1.5    285-288    Pi       2        256          15
MCW0090     54         4.5    85-87      Pn       2        NA           5
MCW0093     56         1.5    240-270    Pi       5        357          3
MCW0098     56         1.5    225-243    Pi       3        213          4
MCW0130     58         1.5    240-242    Pi       2        42           UN
MCW0135     56         1.5    116-122    Pi       2        310          9
MCW0141     58         1.5    96-99      Pi       2        226          3
MCW0160     56         1.5    200-213    Pi       2        103          8
MCW0167     56         1.5    120-122    Pi       2        159          4
MCW0212     54         1.5    160-162    Pi       2        261          3
MCW0230     56         1.5    275-277    Pi       2        41           11
MCW0234     56         1.5    275-285    Pi       3        259          2
MCW0239     54         4.5    140-142    Pi       2        192          2
MCW0242     56         1.5    132-133    Pi       2        170          UN
MCW0244     56         1.5    101-102    Pi       2        124          13
MCW0250     56         1.5    220-230    Pi       3        314          6
MCW0267     54         1.5    263-268    Pi       3        203          9
MCW0302     58         1.5    140-146    Pi       2        355          4
MCW0304     58         1.5    271-281    Pi       3        170          UN
MCW0323     50         4.5    116-119    Pi       2        255          15
MCW0351     58         1.5    143-159    Pi       2        368          8
ROS0313     58         1.5    220-222    Pi       2        155          UN
ROS0316     58         1.5    110-111    Pi       2        86           13
ROS0321     56         1.5    226-240    Pi       3        38           2
ROS0326     58         1.5    218-220    Pn       2        NA           UN
ROS0339     58         1.5    282-320    Pn       2        NA           1
ROS0341     58         1.5    145-156    Pn       2        NA           UN
ROS0346     58         1.5    216-224    Pi       2        126          UN
ROS0348     58         1.5    200-210    Pi       2        93           15
ROS0354     58         1.5    162-164    Pi       2        44           UN
ROS0356     58         1.5    174-178    Pi       2        208          UN
UMA2.032    52         1.5    147-149    Pi       2        255          2
UMA2.080    56         1.5    147-157    Pi       2        208          2
UMA2.123    54         1.5    166-228    Pi       4        113          2
UMA2.195    52         1.5    91-93      Pi       2        44           2

Quail
GUJ0008     60         1.5    165-166    Pn       2        NA           UN
GUJ0010     60         1.5    144-146    Pi       2        78           UN
GUJ0013     60         1.5    136-138    Pi       2        213          UN
GUJ0034     56         1.5    214-215    Pn       2        NA           UN
GUJ0041     56         1.5    107-111    Pn       2        NA           UN
GUJ0042     60         1.5    197-199    Pn       2        NA           UN

Turkey
RHT0003     54         1.5    215-217    Pi       2        278          UN
RHT0010     56         1.5    211-219    Pi       2        84           UN
RHT0011     58         1.5    135-157    Pi       4        129          UN
TUM02       58         1.5    160-162    Pi       2        79           UN
TUM06       58         1.5    142-144    Pi       2        85           UN
TUM11ᵈ      60         1.5    198-202    Pi       2        43           UN
TUM12       58         1.5    212-214    Pi       2        126          UN
TUM17ᵈ      58         1.5    185-191    Pn       2        NA           UN
TUM18ᵈ      58         1.5    252-270    Pi       3        208          UN
TUM20ᵈ      60         1.5    192-206    Pi       2        342          UN
TUM22       56         1.5    132-144    Pi       2        86           UN
TUM23       58         1.5    154-158    Pi       2        41           UN
TUM25ᵈ      56         1.5    176-186    Pi       3        163          UN
TUM32ᵈ      60         1.5    193-197    Pn       2        NA           UN
TUM48ᵈ      58         1.5    209-211    Pi       2        86           UN
TUM49ᵈ      58         1.5    181-183    Pi       2        256          UN
WT-38       60         1.5    116-122    Pn       2        NA           UN
WT-54       60         1.5    157-162    Pi       2        84           UN
WT-75       60         1.5    277-291    Pi       2        39           UN
WT-77       60         1.5    163-173    Pn       3        NA           UN
WT-83       60         1.5    181-183    Pi       2        289          UN
WT-90       60         1.5    252-258    Pi       2        340          UN

ᵃ Specific PCR parameters (annealing temperature [°C], and MgCl2 concentration [mM]) for amplifying the
fragments for acrylamide gel analysis are given for each marker. Fragment sizes and number of alleles are those
determined in the ten individuals tested.
ᵇ Markers are indicated as either Pi (polymorphic and informative) or Pn (polymorphic and either
non-informative or not able to be unambiguously scored).
ᶜ Chicken chromosome assignments follow Schmid et al. (2000).
ᵈ Denotes markers for which primers were redesigned for amplification.

In addition to the chicken markers, a total of 94
microsatellites isolated from quail and turkey was also tested. Eighteen of
the 49 quail markers successfully amplified turkey DNA.
Allelic polymorphism at these microsatellites was comparable to
that observed for the chicken markers with 12 of 18 being
monomorphic and six (33.3%) polymorphic in the resource
families. The maximum number of alleles observed for the
quail microsatellites was two (1.33 alleles per marker). Of the
six polymorphic markers only two were informative in the
UMN/NTBF resource population (Table 3). The 45 turkey
microsatellite markers included six (designated WT) reported
by Latch et al. (2002), 34 (designated TUM) reported by Huang
et al. (1999) and four (designated RHT) reported by Donoghue
et al. (1999). All WT primer pairs successfully amplified turkey
DNA whereas positive amplification was obtained for 27 TUM
markers (17 of these with redesigned primers) and three of the
four RHT primers. Allelic polymorphism at the 36 turkey
markers was found to be higher than that observed for other
avian markers. Fourteen markers (38.8%) were monomorphic
and 22 (61.1%) were polymorphic. The number of alleles
ranged from 1 to 4 with an average of 1.75 alleles per marker. Of
the 22 polymorphic microsatellites, 18 were informative in the
UMN/NTBF resource population.

Table 4. Pairwise linkage statistics (sex-averaged) determined with
Locusmap. Matched linkages involve pairs of chicken markers from the same
linkage group. Unmatched linkages occur between pairs of markers from
different chicken linkage groups. UN denotes markers for which the
corresponding chicken linkage group is unknown.

Marker    GGA  Marker    GGA  Theta  LOD

Matched linkages
LEI0106   1    LEI0107   1    0.094  37.27
LEI0106   1    MCW0036   1    0.330  7.90
LEI0107   1    MCW0036   1    0.303  7.51
LEI0169   1    MCW0036   1    0.038  18.24
LEI0105   2    MCW0234   2    0.337  6.25
LEI0105   2    UMA2.080  2    0.356  6.72
MCW0034   2    UMA2.080  2    0.272  5.06
MCW0034   2    UMA2.195  2    0.048  4.57
MCW0234   2    UMA2.080  2    0.275  4.25
LEI0043   3    MCW0141   3    0.056  7.48
MCW0093   3    MCW0212   3    0.206  12.02
ADL0266   4    MCW0302   4    0.296  4.34
ADL0142   6    ADL0320   6    0.220  17.50
ADL0142   6    ADL0377   6    0.041  52.18
ADL0142   6    MCW0250   6    0.175  22.86
ADL0320   6    ADL0377   6    0.136  18.07
ADL0320   6    MCW0250   6    0.029  25.11
ADL0377   6    MCW0250   6    0.120  23.09
ADL0107   7    ADL0315   7    0.214  3.16
ADL0315   7    LEI0158   7    0.356  5.12
LEI0130   9    MCW0135   9    0.105  22.19
MCW0135   9    MCW0267   9    0.000  17.76
ADL0272   10   MCW0035   10   0.025  10.01
ADL0263   14   LEI0066   14   0.023  85.49
ADL0263   14   LEI0098   14   0.092  19.49
LEI0066   14   LEI0098   14   0.075  21.88
MCW0323   15   ROS0348   15   0.077  4.76
MCW0080   15   MCW0323   15   0.020  57.23
MCW0080   15   ROS0348   15   0.077  4.76

Unmatched linkages
MCW0034   2    MCW0244   13   0.164  5.91
LEI0158   7    MCW0141   3    0.384  3.22
MCW0244   13   UMA2.195  2    0.093  7.16

New linkages
LEI0138   1    ROS0346   UN   0.012  22.04
MCW0036   1    RHT0011   UN   0.259  4.48
MCW0036   1    GUJ0013   UN   0.265  7.54
LEI0117   2    LEI0358   UN   0.000  37.62
MCW0234   2    TUM18     UN   0.176  16.35
MCW0239   2    RHT0003   UN   0.050  22.81
ROS0321   2    MCW0130   UN   0.000  11.44
MCW0098   4    MCW0242   UN   0.050  16.76
MCW0167   4    ROS0313   UN   0.161  3.38
MCW0302   4    TUM11     UN   0.048  4.57
MCW0250   6    WT-90     UN   0.361  3.02
ADL0272   10   GUJ0010   UN   0.128  5.25
LEI0103   10   GUJ0010   UN   0.128  5.25
LEI0066   14   ROS0354   UN   0.023  11.17
ADL0263   14   ROS0354   UN   0.000  13.24
ROS0348   15   TUM06     UN   0.083  6.35
RHT0011   UN   GUJ0013   UN   0.000  6.92
TUM25     UN   GUJ0013   UN   0.045  4.86
TUM25     UN   RHT0011   UN   0.000  36.12
TUM20     UN   ROS0356   UN   0.033  29.77
TUM48     UN   TUM20     UN   0.163  4.65
WT-54     UN   TUM22     UN   0.000  25.28


Linkage analysis 

The number of informative meioses for the 86 avian markers
genotyped ranged from 38 to 410 with an average of 182.9 (sex
combined, Table 3). The genotype dataset included a single
noninheritance error. One F2 in the 3804 sire-family had the
genotype of 156/156 at the ADL0317 marker with sire and dam
genotypes of 152/152 and 152/156, respectively. A total of 54
significant (LOD > 3.00) pairwise linkages involving 60 of the 86
markers were identified (Table 4) and 26 markers remain
unlinked. Sex-averaged LOD scores ranged from 3.02 to 85.49 with
an average of 15.7 in pairwise comparisons (Table 4). An
additional six pairwise linkages were found with LOD scores
between 2.0 and 3.0. These included linkage between MCW0230
[GGA1] and TUM49 (LOD = 2.94, θ = 0.105), TUM18 and
UMA2.032 [GGA2] (LOD = 2.81, θ = 0.372), and TUM02 and
TUM06 (LOD = 2.75, θ = 0.216). Two markers (MCW0093 and
ADL0306) that are approximately 40 cM apart on GGA3 were
linked in the turkey at LOD = 2.11 (θ = 0.277).
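
To give a feel for the LOD > 3.00 threshold, the sketch below computes a two-point LOD score in the simplest textbook setting: phase-known, fully informative meioses, each scored as recombinant or non-recombinant. This is a simplification for illustration only and is not the likelihood that Locusmap evaluates on the three-generation pedigree; the function name and the example counts are hypothetical.

```python
import math

def lod_score(recombinants, meioses, theta=None):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10[ theta^r * (1 - theta)^(n - r) / 0.5^n ].
    If theta is omitted, the maximum-likelihood estimate r/n (capped at 0.5)
    is used.
    """
    r, n = recombinants, meioses
    if n <= 0 or not 0 <= r <= n:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    if theta is None:
        theta = min(r / n, 0.5)
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must satisfy 0 <= theta <= 0.5")
    if theta == 0.0 and r > 0:
        return float("-inf")  # recombinants are impossible when theta = 0

    def xlog10(count, p):
        # count * log10(p), with the convention 0 * log10(0) = 0
        return 0.0 if count == 0 else count * math.log10(p)

    return xlog10(r, theta) + xlog10(n - r, 1.0 - theta) + n * math.log10(2.0)

# Hypothetical counts: 2 recombinants among 40 informative meioses.
print(round(lod_score(2, 40), 2))
# Ten non-recombinant meioses are just enough to exceed LOD = 3.
print(round(lod_score(0, 10), 2))
```

Under this simplified model, ten informative meioses with no recombinants give a LOD of about 3.01, which is why markers with few informative meioses can be assigned only when they are tightly linked.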

Of the significant linkages identified in turkey (LOD > 3.0,
Table 4), 29 occurred between pairs of markers known to
belong to the same linkage group in the chicken (designated as
“matched linkages”). In addition, three linkages were found
between pairs of markers known to be from different linkage
groups in the chicken (designated as “unmatched linkages”).
Interestingly, two of these unmatched linkages include markers
from chicken chromosome 2 (GGA2). GGA2 is believed to be
represented by two chromosomes in the turkey as the result of a
fission/fusion event (Schmid et al., 2000). In addition to the
matched and unmatched linkages, a total of 22 new linkages were
found (Table 4). All linkages involving the quail (GUJ) and turkey
markers (RHT, TUM and WT) fall into this group. Of these, 16
involve at least one chicken marker with a known position on
the chicken linkage map. Marker pairs ADL0263-ROS0354
and GUJ0013-RHT0011 were identically inherited.

In considering the matched linkages, pairwise genetic
distances indicate marker spacing for chicken and turkey were
similar in most cases. For example, pairwise genetic distance
between the GGA1 markers MCW0036 and LEI0106 in the
turkey was 33.0 cM and the distance between LEI0106 and
LEI0107 was 9.4 cM. Genetic distances on the Wageningen
sex-averaged chicken map (Groenen et al., 2000) for these
marker pairs are 38.6 cM and 2.5 cM, respectively (values from
the ARKdb database). Similarly, genetic distance between the
GGA3 markers MCW0093 and MCW0212 in the turkey was
20.6 cM, which is comparable to the 29.6 cM observed on the
Wageningen sex-averaged map.

A total of 18 linkage groups were identified in the turkey
(Table 5) with all but two assignable to a chicken chromosome.
Linkage group assignments were made for seven previously
unlinked chicken markers and both of the informative quail
markers. Ten of the eighteen informative turkey markers were
included in the linkage groups. Determining marker order can
be problematic with relatively small linkage groups. However,
examination of pairwise distances suggests that marker order is
generally conserved between chicken and turkey. For example,
one turkey linkage group contains four markers from GGA6
that show the same marker order as on the chicken map, with
similar marker intervals (Fig. 2).

Fig. 2. Linkage relationships of chicken microsatellite markers in the turkey as determined from genotypes of the UMN/
NTBF resource population. Linkage group is aligned with chicken chromosome 6 (GGA6, URL
http://www.zod.wau.nl/vf/research/chicken/images/gga6.jpg).


Table 5. Linkage groups of avian microsatellite markers as determined in
the turkey by Locusmap. The corresponding chicken chromosome (GGA) is
given for those linkage groups containing at least one marker mapped to
that chicken chromosome. The best marker order is given for linkage groups
with three or more markers. Markers that are reported as unlinked in the
chicken are in bold and markers developed from quail and turkey are
underlined

Chicken chrm  Markers
GGA1          MCW0036, LEI0169, TUM25, GUJ0013 (= RHT0011), LEI0107, LEI0106
GGA1          LEI0138, ROS0346
GGA2          LEI0105, TUM18, UMA2.080, MCW0234
GGA2          ROS0321, MCW0130
GGA2          LEI0117, LEI0358
GGA2          MCW0239, RHT0003
GGA3          MCW0093, MCW0212
GGA4          ADL0266, MCW0302, TUM11
GGA4          MCW0098, MCW0242
GGA4          MCW0167, ROS0313
GGA6          ADL0142, ADL0377, ADL0320, MCW0250, WT-90
GGA7          ADL0107, ADL0315
GGA9          LEI0130, MCW0267, MCW0135
GGA10         ADL0272, LEI0103, MCW0035, GUJ0010 (markers could not be ordered)
GGA14         ADL0263 (= ROS0354), LEI0066, LEI0098
GGA15         TUM06, MCW0080, ROS0348, MCW0323
NA            TUM20, ROS0356, TUM48
NA            TUM22, WT-54


Discussion 

Value of markers from other species 

Several recent studies have examined the usefulness of 
microsatellite markers for cross-species comparisons in such 
diverse taxa as fish (Malloy et al., 2000; Leclerc et al., 2000), 
birds (Dawson et al., 2000; Nesje and Roed, 2000), and primates 
(Kayser et al., 1996). Often only markers that were polymorphic 
in the species of origin are tested for cross-species 
amplification. The conservation of microsatellites between 
related bird species also appears significant given the relative 
success of amplification and comparative sequence data (Cheng et 
al., 1995; Fields and Scribner, 1997; Hanotte et al., 1997; 
Primmer et al., 1997; Petren, 1998; Reed et al., 2000a). Currently, 
interspecies comparative mapping of avian taxa is not possible 
since genetic maps are not available for other species. At the 
gross karyological level, diverse avian species appear to share 
significant genome homology based on comparative chromosome 
painting (Shetty et al., 1999) and this certainly is the case 
between the chicken and turkey (Burt, 2000). The continued 
development of genetic markers for both wild and agricultural 
bird species, plus the increasing availability of DNA sequence 
data, will provide important tools for examining the structure 
and evolution of avian genomes. 

Construction of genetic linkage maps must overcome the 
large number of microchromosomes found in many avian species, 
which significantly increases the number of linkage groups 
relative to genome size. Because of this, the number of markers 
required to identify all of the linkage groups is increased, 
especially for very small chromosomes that are subject to fewer 
recombination events per meiosis. Thus, if the degree of DNA 
sequence variation is sufficiently low to allow for primer 
annealing, the large number of genetically mapped chicken 
microsatellites provides a significant marker resource for 
comparative genetic mapping in other avian taxa such as the turkey. 
The present study provides the most comprehensive survey of 
avian microsatellites in the turkey. With the addition of the 
remaining three chicken microsatellite primer sets (obtained 
from the US Poultry Genome Coordinator) and the quail and 
turkey primers, 772 markers have now been surveyed with an 
amplification rate of approximately 53% (410 of 772 markers). 

The chicken markers examined in this study are distributed 
throughout the chicken genetic map. For example, 59 of the 
356 amplified chicken markers fall on chromosome 1 and sev¬ 
en of these are informative in our resource families. Likewise 
the set of amplified markers includes 48 from chromosome 2 
(11 informative), 21 from chromosome 3 (6 informative) and 
21 from chromosome 4 (4 informative). The highly developed 
map of the chicken (Schmid et al., 2000) will provide a frame¬ 
work for aligning the turkey linkage groups and testing hypoth¬ 
eses of genome organization and chromosomal rearrangement 
between these species. 

The value of microsatellites in cross-species comparisons is 
based not only on the degree to which they amplify in other 
species but ultimately on the level of polymorphism observed 
in the study animals. Approximately 12% of the chicken mark¬ 
ers were polymorphic in the turkey. Based on the linkage rela¬ 
tionships established in the present study, marker order and 
intervals in the turkey may be very similar to those seen in the 
chicken. Genomes of the turkey (2n = 80) and the chicken (2n = 
78) include a small number of macrochromosomes, many 
microchromosomes, and the Z/W sex chromosomes. A major 
karyotype difference between these two species is that turkey 
chromosomes 3 and 6 are believed to represent a centric fission 
of GGA2. In addition to the markers identified in the present 
study, a bank of turkey microsatellites is now available for map 
construction (Reed et al., 2000b, 2002, 2004, unpublished; 
Dranchak et al., 2003). As additional microsatellite markers are 
genotyped on the UMN/NTBF population we anticipate that 
the multiple linkage groups identified in the turkey correspond¬ 
ing to GGA1, GGA2 and GGA4 will coalesce. 


Although the majority of the polymorphic chicken markers 
(66 of 81) were informative in the new UMN/NTBF resource 
population, this represents only 9.7% of the 678 markers 
examined. The utility of these markers, however, will not be limited 
to those that show microsatellite length variation. For example, 
monomorphic markers can also be examined for DNA sequence 
variation in the flanking regions (i.e., single nucleotide 
polymorphisms, SNPs). Sequencing of the chicken genome is 
currently underway at the Genome Sequencing Center, Washington 
University. Completion of the chicken sequence will 
provide a wealth of potential resources for turkey genomics and 
comparative gene mapping. Reed et al. (2000) found that DNA 
fragments obtained from turkeys using chicken microsatellite 
primers corresponded to the chicken sequence for that marker, 
even for markers that did not have an appreciable microsatellite 
repeat. It should be possible to use the chicken genome 
sequence to design PCR primers for focused SNP development 
in the turkey. Assays such as dHPLC analysis, denaturing gradient 
gel electrophoresis (DGGE), single strand conformation 
polymorphism (SSCP) and other emerging technologies could 
then be used to place SNPs on the turkey map. 

Utility of the UMN/NTBF panel 

It is our goal to have the UMN/NTBF resource population 
serve as a mapping standard for the turkey. The UMN/NTBF 
panel will allow researchers to map microsatellites, SNPs, 
specific genes, ESTs and other genetic markers. Because this panel 
represents a limited resource, investigators will be asked to 
follow a set of guidelines that have been established for its 
distribution. In order to obtain DNA from the UMN/NTBF 
population, investigators should contact the senior author (K.M. 
Reed). The full UMN/NTBF panel will be distributed as 
aliquots of DNA from each of the 192 individuals in two 
96-well plates. A Material Transfer Agreement (available from 
K.M. Reed) must be completed prior to the shipment of 
UMN/NTBF DNA. 
Myosin light chain genes in the turkey 
(Meleagris gallopavo) 

Department of Veterinary Pathobiology and Animal Biotechnology Center, University of Minnesota, St Paul MN (USA) 


Abstract. Myosin light chains associate with the motor protein 
myosin and are believed to play a role in the regulation of 
its actin-based ATPase activity. Myosin light chain cDNA 
clones from the turkey (Meleagris gallopavo) were isolated and 
sequenced. One sequence corresponded to an alternative 
transcript, the skeletal muscle essential light chain (MYL1 isoform 
1), and a second to the smooth muscle isoform of myosin light 
chain (MYL6). The DNA and predicted amino acid sequences 
of both light chain genes were compared to those of the chicken. 
Based on the cDNA sequence, oligonucleotide primers were 
designed to amplify genomic DNA from six of the seven 
introns of the MYL1 gene. Approximately 5 kb of DNA was 
sequenced (introns and 3' UTR) and evaluated for the presence 
of single nucleotide polymorphisms (SNPs). SNPs were verified 
by sequencing common intron regions from multiple individuals 
and three polymorphisms were used to genotype pedigreed 
families. MYL1 is assigned to a turkey linkage group that 
corresponds to a region of chicken chromosome 7 (GGA7). The 
results of this study provide genomic reagents for comparative 
studies of avian muscle components and muscle biology. 

Copyright © 2003 S. Karger AG, Basel 


The interaction of filamentous actin with the motor protein 
myosin forms a chemo-mechanical coupling and is the 
basis of cellular movement in eukaryotic cells. Myosin is a large 
protein composed of an enzymatic, actin-binding head region, 
a tail region that binds other myosin tail regions to form myosin 
filaments, and a central light-chain binding region. During the 
myosin ATPase cycle the myosin heads bind to actin, undergo a 
conformational change and force the sliding of the actin and 
myosin filaments relative to each other. 

While research has shown the involvement of the light 
chains in non-skeletal myosin ATPase activity (Pollenz et al., 
1992; Chen et al., 1995), the role of essential light chains (alkali 
light chain, MYL, follows HUGO nomenclature; Wain et al., 2002) 
in skeletal muscle myosin ATPase activity is not yet understood 
(Wagner and Giniger, 1981; Sivaramakrishnan et al., 
1982). MYL1 has been implicated in providing structural support 
to the myosin heavy chain (MHC) neck region, regulation 
of shortening velocity reported in muscle fibers with the slow 
(type 1) myosin heavy chain isoform, and actin/myosin 
cross-linking (Poetter et al., 1996; Andreev et al., 1999; Biral et al., 
1999). By one or a combination of these mechanisms, MYL1 
affects force production. In vitro removal of the myosin essential 
light chain results in a 50% reduction of isometric force 
generation (VanBuren et al., 1994). 

Two fundamentally different light chains, the regulatory 
and the essential light chains, are produced from multiple 
unlinked genes (Kelly and Buckingham, 2000). The essential 
light chain gene family includes MYL1 (fast twitch), MYL3 
(slow twitch, ventricular), MYL4 (atrial, embryonic) and 
MYL6 (smooth muscle). The MYL1 isoforms found in adult 
fast skeletal muscle fibers are produced by alternate splicing of 
the primary transcript (Nabeshima et al., 1984; Barton and 
Buckingham, 1985). One isoform (MYL1 isoform 1) includes 
exons 1, 4, and 5-9, whereas the second (MYL1 isoform 3) 
includes exons 2, 3, and 5-9. During characterization of a 
turkey cDNA library (24-day turkey embryo; Harry et al., 2003), 
we noted that the partial sequence of one clone (Nte0700) 
showed high similarity to the sequence of the myosin essential 
light chain A1 mRNA. Based on the partial sequence and the length 
of the cDNA insert (~950 bp), this clone appeared to represent 
the full-length cDNA of the 1, 4, 5-9 isoform. 

This study was designed to further investigate MYL genes in 
the turkey (Meleagris gallopavo) by determining the entire 
expressed sequence (cDNA clones) and selected intervening 
sequences in the genome. Partial sequences from multiple indi¬ 
viduals were compared to identify polymorphisms, e.g., single 
nucleotide polymorphisms, insertions, and deletions for use in 
genetic linkage mapping. 


Materials and methods 

cDNA amplification and sequencing 

The cDNA clone (Nte0700) was subjected to PCR to amplify the insert 
for DNA sequencing. The reaction contained as template 1 µl of eluted phage 
stock, 1.5 mM MgCl2, 2.5 pmol each primer (T3 and T7 flanking primers or 
insert-specific primers), 100 µM dNTP, and 0.35 U Taq polymerase (Qiagen, 
Inc). Amplifications were performed in a Techne thermal cycler under the 
following reaction conditions: 15 min at 94 °C; 35 cycles of 30 s at 94 °C, 30 s 
at 57 °C, 1 min at 72 °C; and a final extension of 5 min at 72 °C. PCR products 
were resolved on a 1% agarose gel. Sequencing template was prepared 
from the PCR reaction product with the QIAquick PCR purification kit 
(Qiagen, Inc) and analyzed on an automated DNA sequencer with 
vector-specific primers. DNA sequences were edited and contigs constructed with 
Sequencher (Gene Codes Corp.). 

Intron amplification and sequencing 

With the exception of introns A and D, the exons of the MYL1 gene in 
the chicken are separated by introns averaging 685 bp (248 to 1522, 
Nabeshima et al., 1984), suggesting they would be readily accessible to molecular 
analysis in the turkey. Primers for amplifying introns were designed using the 
cDNA sequence from the Nte0700 clone and the previously reported chicken 
MYL1 sequence. Introns were amplified by PCR from either pooled genomic 
DNA (introns B and E) or from ten of the founding individuals of the 
UMN/NTBF resource population families (Reed et al., 2003). Founding individuals 
included the four F1 females, two F1 males plus the two grandsires and 
two granddams of the F1 males. PCR reactions contained approximately 
100 ng genomic DNA, 1.5 mM MgCl2, 22 pmol each primer, 100 µM dNTP, 
and 1 U Taq polymerase. Primer annealing conditions were optimized for 
the amplification of each intron in a Techne thermal cycler under the following 
conditions: 15 min at 94 °C; 35 cycles of 30 s at 94 °C, 30 s at optimal 
annealing temperature, 1 min/kb at 72 °C; and a final extension of 5 min at 
72 °C. Amplified products were resolved on a 1% agarose gel, purified for 
sequencing template with the QIAquick PCR purification kit or QIAquick 
Gel Extraction kit (Qiagen Inc) and sequenced on an ABI automated 
sequencer. Internal sequencing primers were constructed to resolve larger 
intron sequences. DNA sequences were edited and contigs assembled with 
Sequencher (Gene Codes Corp). 

Genotyping and linkage analysis 

Two turkey resource populations were used to examine the inheritance of 
the MYL1 sequence polymorphisms. These included the original NTE mapping 
population used to create the cDNA/RFLP map (Harry et al., 2003) and 
the new UMN/NTBF resource population (Reed et al., 2003). Different optimal 
genotyping strategies were employed for the three polymorphisms. For 
the 6-bp indel (F147), DNA fragments were amplified and labeled for 
electrophoresis by substituting 33P-dATP (0.3 pmol) in the PCR reaction using 
an internal primer combination to give a 200 bp product 
(F147F-TCTGGGAAAACTGATGCAAAG, F147R-TGCATGTGCATGTAGACATAGG). 
PCR products were denatured at 94 °C and electrophoresed through 
5% acrylamide gels. After autoradiography, allele sizes were determined by 
comparing amplified fragments to size markers (M13 sequencing reaction) 
and genotypes were manually scored. 


Table 1. Oligonucleotide primers utilized for amplification and/or 
sequencing genomic DNA. EXON 2F(CHIX) is entirely based on the sequence 
of chicken exon 2 (GenBank acc. no. K02609) 

Primer             Sequence (5'-3') 

EXON 2F(CHIX)      TTTCCAACTCTCAATCATGGTG 
EXON 3R            TGATTTGGTCAGCTGTGAAAG 
EXON 4R            CTGCTCCTTGGAGAACTCG 
INTRON DF(4F)      GATTTCCTTCCAACAGATCGAG 
INTRON DR(5R)      GGTAATCTTGGCATCACCAGTC 
700 INT (5F)       AGAACCCCACAAATGCTGAG 
INTRON ER1         GGGCAAGACGAGTGAGATTC 
INTRON ER2         AAGGAGCAGCTTTCATCTGG 
EXON 6F            AAGGTCTGCGTGTTTTCGAC 
EXON 6R            GGGCAGGAACTCTTCAAAGG 
700 3'F (7F)       AACGGCTGCATCAACTACG 
EXON 7R            AGCCGTTGGAGTCTTCCTG 
EXON 8R            ATCTGGGGATGTCCGCTTAG 
EXON 8F            AGCGGACATCCCCAGATAAC 
EXON 9R            TGCCAGAGCAAGAATTTCAC 
700 TAIL F         TTGGGTCACTTCCAAAAACTG 
700 TAIL R         TTCTTTCATGGGAAACCACATG 


For the SNP in intron C (Nte0700-snp1 = C70), SNP assays were developed 
for analysis on the Beckman CEQ 8000 Genetic Analysis System using 
the CEQ SNP primer extension kit. The SNP kit employs a specifically 
designed oligonucleotide primer to anneal one base short of the target SNP 
and adds a complementary fluorescent-tagged dideoxy nucleotide terminator 
to the primer 3' end according to the template SNP site. Samples were run 
with the internal Beckman SNP size standard and alleles were manually 
checked after automated scoring. The SNP in intron H (Nte0700-snp2 = 
H276) occurred within the recognition sequence GATC for the restriction 
endonuclease MboI. For this SNP, PCR was used to amplify the 
SNP-containing fragment from genomic DNA and the products were digested with 
MboI. 

The resulting DNA fragments were electrophoresed through a 2% agarose 
gel and individual samples were scored according to banding pattern 
(PCR/RFLP). Genotypic data were analyzed with Locusmap software (Garbe and 
Da, 2003). 
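
The logic of the PCR/RFLP assay for H276 can be sketched in a few lines of Python. This is an illustrative sketch only, not the scoring procedure used in the study: it takes the seven-base context for H276 from Table 3 (GGG(A/C)TCA) and the GATC recognition sequence named above, and the helper names (expand_alleles, is_cut, expected_pattern) are hypothetical. It also assumes that no other MboI site in the amplified fragment complicates the banding pattern.

```python
# Sketch only: checks short allele contexts for the MboI recognition site.
MBOI_SITE = "GATC"  # recognition sequence given in the text


def expand_alleles(context):
    """Expand a Table 3 style context such as 'GGG(A/C)TCA' into
    a dict mapping each allele to its full local sequence."""
    left, rest = context.split("(")
    variants, right = rest.split(")")
    return {base: left + base + right for base in variants.split("/")}


def is_cut(sequence, site=MBOI_SITE):
    """True if the recognition site occurs anywhere in the sequence."""
    return site in sequence


def expected_pattern(genotype, alleles):
    """Describe the expected digest for a two-letter genotype, e.g. 'AC'."""
    states = {is_cut(alleles[base]) for base in genotype}
    if states == {True}:
        return "cut"
    if states == {False}:
        return "uncut"
    return "cut + uncut"


h276 = expand_alleles("GGG(A/C)TCA")
for genotype in ("AA", "AC", "CC"):
    print(genotype, expected_pattern(genotype, h276))
```

Under these assumptions only the A allele carries the GATC site, so the three genotypes are expected to give three distinguishable banding patterns.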

Library Screening 

In an attempt to isolate a clone containing the MYL1 isoform 3, the 
cDNA library was screened on charged nylon filters. The 3' end of the cDNA 
clone Nte0700 was amplified for use as template (primers 700int 
AGAACCCCACAAATGCTGAG and Exon 9R TGCCAGAGCAAGAATTTCAC); 
and a random primer labeled probe was generated using 32P-labeled 
dATP and the Prime-a-Gene kit (Promega, Corp). Filters were pre-hybridized 
with 8 ml Rapidhyb solution (Amersham, Corp) for 15 min at 65 °C. 
5 × 10^6 cpm probe was added and allowed to hybridize for 3 h at 65 °C. 
Filters were washed at low stringency in 2× SSC, 0.1% SDS at 60 °C followed 
by a high stringency wash of 0.1× SSC, 0.1% SDS at 60 °C. Positive plaques 
were cored, replated at lower density and secondarily screened using the 
same protocol. Positive secondary plaques were picked and placed in 20 µl 
deionized H2O. Inserts were amplified by PCR with T3 and T7 vector primers 
using the same protocol as above. Inserts were purified with QIAquick 
columns and sequenced with an ABI automated sequencer. 


Results 

MYL1 isoform 1 

The contiguous cDNA sequence of the clone Nte0700 representing 
the MYL1 isoform 1 (exons 1, 4-9) was 954 bp, of which 
579 bp encode 192 amino acids and the stop codon (Fig. 1). 
Additional library screening and subsequent DNA sequencing 
has increased the total sequence information for the turkey 
MYL1. The 5' untranslated region includes 124 bases. The initial 
3' untranslated region consisted of 358 bp pre-poly-A tail; 
however, a clone with an extended 3' UTR (111 bp) and an 
alternate polyadenylation site was uncovered through library 
screening. The turkey and chicken sequences display 96.9% 
overall similarity. There are 18 nucleotide differences: one in the first 
position, two in the second position, and 15 in the third position. 
These changes resulted in four predicted amino acid differences 
(98% identical) that are mostly conservative changes: 
Ser46Phe, Lys62Arg, Met89Ile, and Tyr127Phe. 

Known chicken exon/intron boundaries were used as guide¬ 
lines for oligonucleotide primer design in intron amplification 
(Table 2, Fig. 2). Six complete introns of the MYL1 gene were 
amplified and sequenced. Intron D was sequenced from exon 
boundaries leaving ~2 kb undetermined. As expected the 
introns of MYL1 displayed greater sequence variation between 
species than the exons (Table 2). The percent similarity ranged 
from 83.5 to 93.5% with the largest sequenced intron showing 
the greatest proportion of differences and the shortest being the 
most conserved. The GC content among completely sequenced 
introns varied between 36.6 and 45.6%. The average percent 
GC was 40.8% within introns and 46.8% within the cDNA 
(MYL1 isoform 1). 

MYL1 isoform 3 

The cDNA library was extensively screened for lambda 
clones containing exons 1 and 2 to increase the sequence knowl¬ 
edge and to reaffirm the single gene-alternate splicing hypothe¬ 
sis set by Nabeshima et al. (1984). No clone corresponding to 
MYL1 isoform 3 was identified; however, many alternative 


Fig. 1. Aligned cDNA sequences of turkey (GenBank acc. no. AY310746) 
and chicken (GenBank acc. no. K02608 and K02610) myosin essential light 
chain gene (MYL1 isoform 1). Corresponding amino acids are given below 
the nucleotide sequence and the positions of exons are denoted above. Amino 
acid residues that differ between the turkey and chicken are denoted by 
boldface type. 


Table 2. Comparison of non-coding regions of the MYL1 gene of the turkey and chicken. The number of total nucleotides for each species, and the percent 
sequence similarity are given for each region. The 208-bp insertion/deletion that occurs between the two species in intron H was counted as a single mutation. 
Comparisons for intron D were not made because the full-length intron of the turkey was not sequenced. 3' UTR includes only comparable post-intron H 
sequence. G+C content refers only to non-coding bases of the turkey sequence. 

Region     Turkey bp  Chicken bp  Number of    Differences/bp  Sequence    G+C      Primers                         Annealing  Product 
                                  differences                  similarity  content                                  temp (°C)  size (bp) 

Intron B   614        612         83           0.135           86.5%       36.6%    EXON 2F(CHIX) - EXON 3R         58         659 
Intron C   248        245         16           0.065           93.5%       45.6%    EXON 2F(CHIX) - EXON 4R         58         931 
Intron D                                                                            INTRON DF(4F) - INTRON DR(5R)   58 
Intron E   1522       1514                                     82.5%       41.5%    700 INT (5F) - EXON 6R          58 
Intron F                          70                           85.1%       41.8%    EXON 6F - EXON 7R               60 
Intron G                          74                           85.7%       42.8%    700 3'F (7F) - EXON 8R          58 
Intron H   584                    73                           87.5%       39.3%                                    58         671 
                                                                           40.9% 
Fig. 2. Idiogram of the MYL1 gene. Position 
of exons (above) and introns (below) are as indicated. 
Position and GenBank accession number 
for intron fragments amplified from genomic 
DNA are indicated below. 

MYL1 isoform 1 clones were found. These partial length clones 
varied in size and included one clone with an extended 3' end. 
The sequence of this clone continued past the polyadenylation 
recognition site to a second site 111 bases downstream. 

In an attempt to resolve some of the intervening sequence 
between exons 1 and 4 of the MYL1 in turkey, PCR primers 
were designed utilizing known chicken sequence to amplify 
introns B and C. The predicted sequence of exon 3 in the turkey 
(TCTTTCACAGCTGACCAAATCAATG) diverges from that 
of the chicken (TCCTTCTCACCTGACGAAATCAATG) by 
four nucleotides (84% sequence conservation) resulting in 
three amino acid differences, Thr4Ser, Ala5Pro, and Gln7Glu 
(62.5% amino acid conservation). No sequence data for exon 2 
were determined in the turkey; however, chicken and turkey 
must share significant homology since the oligonucleotide 
primer based on chicken exon 2 and flanking introns was used 
to successfully amplify turkey genomic DNA. Nabeshima et al. 
(1984) reported exon 2 in the chicken to consist of a single start 
codon (ATG). Turkey sequence maintains the same reading 
frame. 
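
The exon 3 comparison can be reproduced directly from the two 25-base sequences quoted above. The short Python sketch below counts nucleotide mismatches and translates both sequences; it assumes that translation starts at the first base of exon 3 (that is, immediately after the ATG of exon 2) and uses only the subset of the standard genetic code needed for these codons. The function names are illustrative.

```python
# Exon 3 sequences as quoted in the text.
TURKEY_EXON3 = "TCTTTCACAGCTGACCAAATCAATG"
CHICKEN_EXON3 = "TCCTTCTCACCTGACGAAATCAATG"

# Subset of the standard genetic code covering the codons that occur here.
CODONS = {
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TTC": "Phe",
    "ACA": "Thr", "GCT": "Ala", "CCT": "Pro", "GAC": "Asp",
    "CAA": "Gln", "GAA": "Glu", "ATC": "Ile", "AAT": "Asn",
}


def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def translate(seq):
    """Translate complete codons from the first base; a trailing partial
    codon (here the final G) is ignored."""
    usable = len(seq) - len(seq) % 3
    return [CODONS[seq[i:i + 3]] for i in range(0, usable, 3)]


nt_diffs = mismatch_positions(TURKEY_EXON3, CHICKEN_EXON3)
nt_identity = 100 * (len(TURKEY_EXON3) - len(nt_diffs)) / len(TURKEY_EXON3)

turkey_aa = translate(TURKEY_EXON3)
chicken_aa = translate(CHICKEN_EXON3)
# Residue numbers count the initiator Met encoded by exon 2 as residue 1.
aa_diffs = [
    f"{t}{i + 2}{c}"
    for i, (t, c) in enumerate(zip(turkey_aa, chicken_aa))
    if t != c
]
aa_identity = 100 * (len(turkey_aa) - len(aa_diffs)) / len(turkey_aa)

print(nt_diffs, nt_identity)
print(aa_diffs, aa_identity)
```

With these assumptions the sketch returns four nucleotide differences (84%) and the three substitutions Thr4Ser, Ala5Pro and Gln7Glu (62.5% over the eight complete codons), in agreement with the values reported above.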

Smooth muscle MYL6 

Library screening for MYL1 isoform 3 and subsequent 
DNA sequencing identified a clone (c-374) with high similarity 
to the smooth muscle form of MYL. The complete contiguous 
sequence of this clone (Fig. 3) was 485 bp in length, 456 bp of 
which corresponded to the coding region (151 predicted amino 
acids + stop codon), plus 5 bp of 5' upstream sequence and 
24 bp of 3' UTR. Sequence divergence within the coding region 
between turkey and chicken was similar to that seen for MYL1 
isoform 1. Contrasting this sequence with the sequence of the 
chicken (Nabeshima et al., 1987, GenBank acc. no. M15646) 
found 17 nucleotide substitutions within the coding region 
resulting in one predicted amino acid difference (96% sequence 
homology and 99% amino acid conservation). 

Fig. 3. Aligned cDNA sequences of turkey (GenBank acc. no. AY310747) 
and chicken (GenBank acc. no. M15646) smooth muscle MYL6. Corresponding 
amino acids are given below the nucleotide sequence and the positions 
of exons are denoted above. Amino acid residues that differ between 
the turkey and chicken are denoted by boldface type. 

SNP identification and verification 

After completing the sequence of the cDNA clones, oligonucleotide 
primers were designed to amplify seven of the MYL1 
introns (B, C, D, E, F, G, and H) to find SNPs within the amplified 
fragments that would be applicable to genotyping (Fig. 2). 
Four introns (C, F, G, and H) were sequenced across multiple 
individuals, DNA sequences were aligned with Sequencher 
software (Gene Codes Corp), and potential SNPs were manually 
tagged. The large size of intron D (> 3 kb) precluded its complete 
sequencing. A total of 23 sites were tagged as potential 
SNPs (Table 3) within the 4976 bp sequenced, or approximately 
1 SNP/226 bp. In addition to the SNPs a larger 6-bp 
insertion/deletion polymorphism was identified in intron F (F147). 
Three potential SNPs (C70, E800, and H276, Table 3) and the 
6-bp indel in intron F (F147) were verified by sequencing a 
representative sample of F2 individuals from the UMN/NTBF 
mapping families. The SNP sequence genotypes of the F2 individuals 
examined were consistent with Mendelian expectations 
given the known parent-offspring relationships. The 6-bp indel 
in intron F and two SNPs (Nte0700-snp1, C70 and 
Nte0700-snp2, H276) were subsequently chosen for inheritance studies. 
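
A check for Mendelian consistency of the kind mentioned for the F2 genotypes can be expressed compactly. The following Python sketch is a generic illustration, not the procedure used in the study: it assumes unphased biallelic genotypes written as two-character strings, and the function name and the example trios are hypothetical.

```python
def mendelian_consistent(sire, dam, offspring):
    """True if the offspring genotype can be formed by taking one allele
    from each parent. Genotypes are two-character strings, e.g. 'AG'."""
    first, second = offspring
    return (first in sire and second in dam) or (second in sire and first in dam)


# Hypothetical trios (sire, dam, offspring) for a G/A SNP.
trios = [
    ("AG", "GG", "AG"),  # consistent
    ("AG", "AG", "GG"),  # consistent
    ("AA", "AA", "AG"),  # inconsistent: neither parent carries G
    ("AA", "GG", "AA"),  # inconsistent: dam cannot transmit A
]
results = [mendelian_consistent(*trio) for trio in trios]
print(results)
```

A genotype that fails this test points to a genotyping error, a sample mix-up or an incorrect pedigree record, so such a check is a natural filter before linkage analysis.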

Table 3. Potential single nucleotide polymorphisms (SNPs) identified in 
MYL1 introns. For each SNP the intron and position, specific polymorphism, 
number of observed heterozygotes and presence of alternate homozygous 
individuals are given. SNPs in introns B and E were determined from 
sequencing of pooled DNA samples and as such, the number of heterozygotes 
and presence of alternate homozygotes could not be determined. 

Intron/position  Polymorphism    Number of      Alternate 
                                 heterozygotes  homozygotes 

B167             GCT(C/T)AAT     NA             NA 
B274             GCT(C/T)GTT     NA             NA 
B278             GTT(A/T)TAA     NA             NA 
B610             CAA(C/T)TGC     NA             NA 
C70              AGC(G/A)TTC     3              Yes 
E48              AAC(A/T)AAT     NA             NA 
E800             CTA(C/T)CTA     NA             NA 
E1072            TTA(A/T)ATA     NA             NA 
E1225            TCC(C/T)CCA     NA             NA 
E1304            ACT(G/A)TAT     NA             NA 
E1470            ATC(C/T)CAG     NA             NA 
F51              AGG(C/T)ATG     4              No 
F147             G(AAAAAG/-)T    2              Yes 
F172             AAA(C/T)TCA     0              Yes 
G121             AAG(A/G)TTT     5              No 
G258             TTG(A/G)AAG     4              No 
G411             GTG(C/G)CTT     2              No 
G427             ATC(A/G)GCT     1              No 
G434             CCT(G/C)AAC     1              No 
G473             GTA(C/A)TAT     3              No 
H243             GTA(A/T)TCT     3              Yes 
H276             GGG(A/C)TCA     4              Yes 
H399             AAA(A/C)TGC     0              Yes 


Linkage Analysis 

To examine the inheritance of the MYL1 polymorphisms, 
we genotyped the F147 indel polymorphism, Nte0700-snp1, 
and Nte0700-snp2 on the original NTE mapping population 
(Harry et al., 2003). Linkage analysis included the original 
inheritance data for the Nte0700 locus as well as the new 
MYL1 marker genotypes. The number of informative meioses 
ranged from 82 to 117 (average 98.5). Resultant analyses found 
significant genetic linkage between all four markers, as expected, 
with an average pairwise LOD score of 18.04. 
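
For readers less familiar with LOD scores, the textbook two-point calculation for phase-known meioses is sketched below in Python. This is a simplified illustration and not a description of how Locusmap computes likelihoods in these pedigrees; the function names and the counts in the example are hypothetical. With n informative meioses, of which k are recombinant, the LOD at recombination fraction theta is log10 of theta^k (1 - theta)^(n - k) divided by 0.5^n.

```python
import math


def lod_score(n, k, theta):
    """Two-point LOD for n phase-known informative meioses with k
    recombinants, evaluated at recombination fraction theta."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    return (k * math.log10(theta)
            + (n - k) * math.log10(1 - theta)
            + n * math.log10(2))


def max_lod(n, k):
    """LOD at the maximum-likelihood estimate theta = k/n (capped at 0.5)."""
    if k == 0:
        # No recombinants observed: the likelihood ratio is 1 / 0.5**n.
        return n * math.log10(2)
    return lod_score(n, k, min(k / n, 0.5))


# Hypothetical example: 100 informative meioses, 5 recombinants.
print(round(max_lod(100, 5), 2))
```

By convention a LOD of 3 or more is taken as significant evidence of linkage, which is why values near 18 indicate very tight linkage among markers derived from the same gene.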

To place MYL1 on the new turkey linkage map, the F147 
indel polymorphism was genotyped as previously described on 
the new UMN/NTBF resource population (Reed et al., in 
review). The F147 polymorphism was informative in two of the 
four dam families that comprise the UMN/NTBF population 
(125 informative meioses). Two-point linkage analysis was performed 
with all previously genotyped markers (Reed et al., 
2003). Significant pairwise linkage relationships were found 
between MYL1 (F147) and two other markers, ADL0107 and 
MNT-ADL0315 (MNT-ADL0315 corresponds to chicken locus 
ADL0315 but utilizes primers designed from turkey DNA 
sequence, Reed et al., 2003). Both of these loci have been mapped 
in the chicken to chromosome 7 (GGA7, Schmid et al., 
2000). The Kosambi mapping option of Locusmap was used to 
translate recombination frequencies into map distances in 
centimorgans (cM) and marker order was determined with Locusmap. 
Results indicate a marker order of MYL1-ADL0107-MNT-ADL0315 
with the distance between MYL1 and 
ADL0107 being 5.28 cM (LOD 4.02) and the distance between 
ADL0107 and MNT-ADL0315 being 22.91 cM (LOD 3.16). 
Although there is currently no comprehensive microsatellite-based 
map of the turkey genome, MYL1 is clearly linked to 
genetic markers corresponding to a region of GGA7. 
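
The Kosambi mapping function referred to above converts a recombination fraction r into a map distance d = 25 ln[(1 + 2r)/(1 - 2r)] cM. A minimal Python sketch of the function and its inverse follows. It illustrates the standard formula only and makes no claim about how Locusmap implements it; the function names are assumptions.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Recombination fraction implied by a Kosambi distance in cM."""
    if d_cm < 0:
        raise ValueError("distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)


# Recombination fractions implied by the two intervals reported above.
for interval, d in (("MYL1-ADL0107", 5.28), ("ADL0107-MNT-ADL0315", 22.91)):
    print(interval, d, round(kosambi_r(d), 3))
```

For short intervals the distance in cM is close to 100r, whereas for longer intervals such as the 22.91 cM interval the function corrects for undetected double crossovers, so the implied recombination fraction is somewhat smaller than d/100.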

Discussion 

Muscle research in the turkey is generally devoted to production 
quality, with limited but potential interest in model physiology 
for human disease or comparative biology. Muscle 
proteins form the structural backbone by which production 
qualities such as feed conversion, carcass weight, and overall 
meat quality are measured. Therefore, it is of particular interest 
to develop genetic markers for genes associated with both muscle 
structure and function. SNP identification through intron 
and 3' UTR sequencing appears to be an efficient method for 
genetic mapping given the limited genomic reagents available 
for the turkey. This study demonstrates that genomic resources 
(cDNA library, mapping populations and linkage map) are now 
available to pursue these goals. 

The level of genetic divergence between the turkey and the 
chicken appears limited, both at the phenotypic and genotypic 
level. At the biological level, chicken and turkey hormones have 
been known to be cross-reactive (Lawson et al., 2001). At the 
molecular level, oligonucleotide primers specific for chicken 
sequence (highly conserved exon sequence and less conserved 
intron sequence) have been employed to amplify turkey 
introns from genomic DNA, and chicken microsatellite 
primers can be used with varying effectiveness in the turkey 
(Reed et al., 2000). Sequence data for the myosin light chain 
genes of the turkey appear to support this assertion. The MYL1 
isoform 1 coding region displayed 96.9% identity with 98% 
identity in predicted amino acid sequence. Likewise, the 
smooth muscle MYL6 sequence and amino acid homologies 
were 96% and 99%, respectively. These results coincide with 
other sequence comparisons between the two species (Lawson 
et al., 2001; Kaiser, 2002). MYL1 introns, in spite of relaxed 
selective pressure, maintain a similar, though lesser, extent of 
conservation in both size and sequence. Intron sequence com¬ 
parisons between species revealed an average sequence similar¬ 
ity of 86.8%. 

Substantial sequence differences between the chicken and 
turkey were found in exon 3 of the MYL1 gene. In fact, the two 
flanking introns of this sequence both share a higher degree of 
similarity between chicken and turkey. Within the 25 bp
sequence of exon 3 there are four nucleotide differences (84%
similarity) and three predicted amino acid changes (62.5%
conservation). The 41 additional amino acid residues encoded by
exons 1 and 4 of MYL1 isoform 1 are believed to interact with
actin and myosin in cross-bridge formation (Andreev et al.,
1999; Timson et al., 1999). The lack of this functional
requirement on exon 3 may allow for more sequence divergence
between the species. Clones corresponding to MYL1 isoform 3
were not identified through screening of the turkey embryonic
cDNA library. Expression of this isoform has been reported to
be upregulated in mammalian fetal tissues (Kelly and
Buckingham, 2000). In other species MYL1 isoforms 1 and 3 are
expressed in fast twitch muscle in varying ratios, allowing
for fine-tuning of muscle activity (Seidel et al., 1988; Hailstones
et al., 1990; Rao et al., 1996). Additional research into the
significance of the variation in exon 3 between the chicken and
turkey is warranted.
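
The similarity figures quoted for exon 3 follow from simple position-by-position counting: four differences in 25 bp give 84%, and 62.5% conservation is consistent with three changes among eight residues. A minimal Python sketch of that calculation is given below. The sequences are hypothetical placeholders chosen only to reproduce the counts; they are not the real chicken or turkey exon 3 sequences, and the function name is an illustration rather than part of the original analysis.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 25 bp sequences differing at four positions (not real exon 3 data).
chicken_exon = "ACGTACGTACGTACGTACGTACGTA"
turkey_exon = "ACGAACGTACCTACGTACGAACGTT"

# Hypothetical 8-residue peptides differing at three positions.
chicken_pep = "MKLSTAVE"
turkey_pep = "MKLSTGIQ"

print(percent_identity(chicken_exon, turkey_exon))  # 84.0
print(percent_identity(chicken_pep, turkey_pep))    # 62.5
```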

SNPs are the most common form of genetic variation that 
occurs between individual eukaryotic genomes and are being 
developed as the next generation of molecular markers for the 
mapping of complex genetic traits (Vignal et al., 2002). The 
usefulness and potential genetic informativeness of SNPs are
dependent on their position in the genome, being either associated
with expressed or non-expressed DNA (Kwok et al., 1996).
SNPs in expressed sequences are increasingly being used for
fine-scale QTL mapping in humans (Zhoa et al., 1998; Roses,
2002) and agriculturally important species such as cattle
(Grosse et al., 1999) and chicken (Smith et al., 1996). Genetic
markers associated with expressed DNA sequences (expressed
sequence tags, ESTs) provide additional information beyond
that of anonymous DNA markers such as microsatellites in that
they indicate the position of genes within the genome. The
efficiency of SNP discovery depends to a large degree on the frequency
of heterozygous positions in an individual being surveyed (H;
Fahrenkrug et al., 2002), the size of the region being sequenced,
and the number of genomes being surveyed. Estimates suggest
that SNPs occur at a rate of approximately 1 in 1000 bp in the
human DNA sequence (H = 0.001 per bp; Chakravarti, 1999).
Recent studies in the chicken estimate the frequency of SNPs at
one every 470 bp (Smith et al., 2001). In the present study one
polymorphic base was observed for every 226 bp of turkey
sequence examined. However, as with the study of Smith et al.
(2001), this estimate was made using groups of animals, and
does not represent a value equivalent to H above. Ongoing
surveys of 3' UTR sequences in the turkey suggest comparable
values for other genes (Reed, unpublished).
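
To put the three reported frequencies on a common scale, they can be converted to polymorphic sites per kilobase. The Python sketch below does only that arithmetic; the dictionary and function names are illustrative. As noted in the text, the chicken and turkey values come from groups of animals and are not equivalent to the per-individual H quoted for human, so the comparison is indicative only.

```python
# Reported frequencies, expressed as base pairs per polymorphic site.
bp_per_snp = {"human": 1000, "chicken": 470, "turkey": 226}


def snps_per_kb(bp_per_site: float) -> float:
    """Convert 'one SNP every N bp' into SNPs per 1,000 bp."""
    return 1000.0 / bp_per_site


for species, bp in bp_per_snp.items():
    print(f"{species}: {snps_per_kb(bp):.2f} SNPs per kb")
```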

SNP development in the turkey is part of an expanded effort 
at the University of Minnesota to create a comprehensive 
genetic linkage map of the turkey genome. One aspect of this 
work is the combination of the type I markers of the cDNA/RFLP
“gene” map (Harry et al., 2003) with the emerging
microsatellite-based map (Reed et al., 2003). The current cDNA/RFLP
linkage map contains 113 loci arranged in 22 linkage groups, 
with another 25 loci that remained unlinked, including MYL1 
(Nte0700). The present study found significant genetic linkage 
between an MYL1 polymorphism and two other genetic mark¬ 
ers developed in the chicken, ADL0107 and ADL0315. Both of 
these chicken markers are mapped in the chicken to GGA7 and 
included in the study of genetic linkage for avian microsatellite 
loci in the turkey (Reed et al., 2003). The structure of the
genomes of the chicken and turkey is likely to be very similar,
with at least one major chromosomal difference (Schmid et al.,
2000). Linkage relationships and marker order as determined
for several chicken markers genotyped in the turkey, including
ADL0107 and ADL0315, appear to be comparable to those
observed in the chicken (Harry et al., 2003; Reed et al., 2003).
Based on these data, MYL1 in the turkey is assigned to a
linkage group corresponding to a region of GGA7. To our
knowledge, the MYL1 gene has not been mapped in chicken.
However, we hypothesize that it is located on GGA7 within 10 cM
of ADL0107.
Abstract. Expressed sequence tag (EST) projects have
produced extremely valuable resources for identifying genes
affecting phenotypes of interest. A large-scale EST sequencing
project for rainbow trout was initiated to identify and functionally
annotate as many unique transcripts as possible. Over 45,000
5' ESTs were obtained by sequencing clones from a single
normalized library constructed using mRNA from six tissues. The
production of this sequence data and creation of a rainbow
trout Gene Index eliminating redundancy and providing
annotation for these sequences will facilitate research in this
species.

Copyright©2003 S. Karger AG, Basel 


Genome research for species of interest is facilitated by the
development of species-specific tools such as well-characterized
germplasm, physical and genetic mapping resources,
large-insert libraries, public bioinformatic databases, and large
quantities of sequence information. Interest in the utilization of
rainbow trout (Oncorhynchus mykiss) as a model species for
genome-related research activities focusing on carcinogenesis,
toxicology, comparative immunology, disease ecology,
physiology, transgenics, evolutionary genetics, and nutrition has been
well documented by Thorgaard et al. (2002). Coupling great
interest in this species as a research model with the need for
genetic improvement for aquaculture justifies the continued
development of genome resources facilitating selective
breeding. Current genomic resources available for rainbow trout
research include multiple bacterial artificial chromosome
(BAC) libraries (Katagiri et al., 2001; Palti, unpublished;
Hansen, unpublished), clonal lines (Young et al., 1996), and genetic
maps (Young et al., 1998; Sakamoto et al., 2000; Nichols et al.,
2003). Fewer than 1,000 rainbow trout ESTs were available in
the NCBI databases prior to the GenBank release of the ESTs
described in this paper, which was far behind other fish
research models like zebrafish and medaka (Thorgaard et al.,
2002). Our concentrated and rapid effort to develop large
numbers of ESTs will enable comparative mapping with those
research model fish species and with other vertebrates.

Two genomic strategies currently employed for the
identification of novel and previously characterized genes affecting
phenotypes of interest include the identification of quantitative
trait loci (QTL; Lander and Botstein, 1989) and
high-throughput studies of gene expression through microarray technologies
(Schena et al., 1995). The identification of QTLs in families
known to segregate genetic variation affecting a phenotype of
interest requires selection of molecular markers from genetic
maps and correlation of their genotypes with phenotypic
information. Results of these whole genome scans reveal one or
more chromosomal regions harboring genes directly affecting
the trait. This approach has been successful for rainbow trout
in localizing many quantitative and qualitative trait loci
(Jackson et al., 1998; Danzmann et al., 1999; Sakamoto et al., 1999;
Nakamura et al., 2001; Ozaki et al., 2001; Perry et al., 2001;
Robison et al., 2001). Fine mapping such regions will identify
genotypes for use in marker assisted selection, and will
ultimately pinpoint the exact DNA sequence variation
responsible. Functional genomic approaches are utilized to characterize
gene expression through experiments involving microarrays or
other technologies addressing single genes. The identification
and comparative annotation of species-specific expressed
sequence tags (ESTs; Adams et al., 1991) facilitates both of these
strategies by (1) associating transcripts with functional
annotations based on comparative information, resulting in
species-specific sequence data for candidate genes; (2) enabling
high-throughput identification of novel uniquely expressed transcripts
which may play roles in specific biological processes; and
(3) providing resources and data that can be used in the
construction of DNA microarrays. Furthermore, comparison of ESTs with
genomic sequence from the same or similar organisms (pufferfish,
zebrafish) can be used to develop transcript models for
annotation of genes on genomic sequences.

Our approach for this EST project included the construction
and 5' sequencing of clones from a single normalized cDNA
library (NCCCWA 1RT) constructed using mRNA from six
tissues. Brain, gill, liver, spleen, kidney, and muscle tissues were
arbitrarily chosen due to their diversity of physiological
mechanisms, which suggests they should yield a diverse set of
transcripts. Through normalization (Soares et al., 1994), the
frequency of identifying unique transcripts should increase. For
the same reason a pooling strategy was also implemented. 5'
sequencing was chosen to increase the amount of comparative
annotation derived from better studied species, as 5' sequences
are more likely than 3' sequences to extend into coding
regions.

As with all salmonids, rainbow trout experienced a recent
genome duplication event resulting in a semi-tetraploid state
(Allendorf and Thorgaard, 1984). Instead of maximizing the
number of individuals contributing to the library and
potentially resulting in the identification of single nucleotide
polymorphisms (SNPs), we minimized the number of individuals to
reduce allelic variation and simplify the identification of
duplicated loci in clustering analyses. The Institute for Genomic
Research’s (TIGR) Gene Indexing (http://www.tigr.org/tdb/tgi) was
chosen as the best vehicle for functional annotation as it
has been successful with many other species including aquatic
model species (medaka and zebrafish) and animal agricultural
species (cattle, pig, chicken and catfish). Gene Indices contain
assemblies of ESTs and full-length cDNAs into contigs,
comparative sequence information, functional annotation
associated with unique sequences, and mapping and expression data
when available. To facilitate and enhance genome research in
rainbow trout, an EST project was initiated to (1) identify as
many unique rainbow trout transcripts as possible,
(2) functionally annotate ESTs with comparative genome information,
and (3) identify and target EST sequences of interest for genetic
mapping and gene expression studies. Herein, we report partial
sequences for approximately 25,625 unique transcripts
clustered (or assembled) from 47,621 ESTs. Based on the estimated
number of genes in other sequenced organisms (Adams et al.,
2000; Venter et al., 2001; Aparicio et al., 2002) our data
represent a sizeable portion of rainbow trout genes.


Materials and methods 

Library construction 

Brain, gill, liver, spleen, kidney, and muscle tissues were dissected from
three approximately one-year-old male Kamloop strain rainbow trout and
flash frozen in liquid nitrogen. A single cDNA library was constructed from
these tissues by Life Technologies (Invitrogen, www.invitrogen.com) using
proprietary methods. Briefly, a primary library was constructed by
directionally cloning mRNA from rainbow trout brain, gill, muscle, kidney, spleen,
and liver tissues using SuperScript II RNase H- reverse transcriptase,
ElectroMax DH10B cells, and the pCMV-Sport6.0 vector. First strand synthesis
was evaluated by incorporation of radioactive label. Twenty-three clones
were picked to determine average insert size and the percentage of clones
with inserts. The library was amplified using a semi-solid agarose procedure.
Normalization to Cot 500 or greater was accomplished using proprietary
subtraction technology. Reduction of the abundant gene marker GAPDH
(GenBank AF027130) was confirmed by hybridization. To initiate an EST
project, 60,288 clones were plated and picked into 157 numbered 384-well
plates and grown in LB amp to create frozen glycerol stocks. To identify
transcripts of interest by hybridization, the arrayed clones were gridded in
duplicate onto three high-density nylon filters with a maximum of 27,648 clones
per filter.

EST sequencing and quality analysis 

Sequencing was attempted on all 60,288 arrayed clones, estimating that a
success rate of 70-75% would yield ~45,000 quality EST sequences.
384-well format frozen glycerol stocks were re-arrayed into 96-well format
using a QIAGEN Biorobot 3000 to inoculate 96-well plates with growth
medium (LB amp). Aliquots were removed from the overnight cultures (22 h at
37 °C) to create frozen glycerol stocks and the remaining culture was used in
plasmid isolation using the QIAGEN Biorobot 3000 and following the
QIAGEN QIAprep 96 Turbo protocol. DNA quality was assessed using A260/280
measurements and by spot-checking 16 samples per 96-well plate by agarose gel
electrophoresis. Sequencing reactions using the standard SP6 primer
(ATTTAGGTGACACTATAG) were prepared using this information and
standard protocols for ABI Big Dye Terminator chemistry (ABI, Foster City,
CA). To facilitate integration of sequence data into databases, .ab1 sequence
files were assigned names that contained the following information separated
by underscores: (1) run date; (2) run number on that day; (3) sequencer ID;
(4) well position on sequencer; (5) library ID; (6) 384-plate number and well
position; (7) 96-well format quadrant ID; (8) 96-well plate well position;
(9) sequencing oligo ID; and (10) capillary number. Quality sequences were
determined using the criteria of PHRED > 20 (CodonCode Corporation,
Dedham, MA) over 100 base pairs. Sequences meeting NCBI criteria were
submitted to the EST database at GenBank. All EST sequences were
BLASTed (BLASTN, BLASTX) to obtain functional annotation for
individual ESTs, which is not necessarily included in the Gene Index. BLAST
(Altschul et al., 1990) searches were done with an e-value cut-off of 1e-5, and
the top three hits from each search were saved. Searches were done against a
total of six databases. Each sequence was BLASTed first against protein and
nucleotide databases containing all GenBank sequences from the same species, then
against databases containing all salmonid sequences, then against the
standard non-redundant protein and nucleotide databases. An effort was made to
see that the same sequence did not occur multiple times (e.g. when a
sequence from the trout-only database is also the best hit in the nr database)
but some duplication did occur.
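
The ten-field, underscore-separated naming convention for the sequence files lends itself to mechanical parsing. The Python sketch below shows one way such a name could be split into labeled fields. The field labels and the example file name are hypothetical (the paper lists the fields but gives no example name), and the sketch assumes that no individual field contains an underscore.

```python
from typing import NamedTuple


class TraceName(NamedTuple):
    """The ten fields listed in the text, in order (labels are illustrative)."""
    run_date: str
    run_number: str
    sequencer_id: str
    sequencer_well: str
    library_id: str
    plate384_and_well: str
    quadrant_id: str
    plate96_well: str
    oligo_id: str
    capillary: str


def parse_trace_name(filename: str) -> TraceName:
    """Split a trace file name into its ten underscore-separated fields."""
    stem = filename.rsplit(".", 1)[0]  # drop the file extension, if any
    parts = stem.split("_")
    if len(parts) != len(TraceName._fields):
        raise ValueError(f"expected 10 fields, found {len(parts)}: {filename}")
    return TraceName(*parts)


# Hypothetical file name, invented for illustration only.
example = parse_trace_name("20020612_2_SEQ1_A01_1RT_045H12_Q3_D06_SP6_17.ab1")
print(example.library_id, example.plate384_and_well, example.capillary)
```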

Fig. 1. Quality score ranges for the 61,185 attempted 5' sequencing
reads, shown as continuous base pairs per sequence meeting the criterion
of PHRED > 20. Sequences meeting this criterion and over 100 base pairs
in length (47,051) were submitted to GenBank.

Gene indexing

TIGR Gene Index definitions and protocols have been described
previously (Quackenbush et al., 2001; Pertea et al., 2003) and are available on
the internet at http://www.tigr.org/tdb/tgi/definitions.html. To summarize,
all EST data for rainbow trout were extracted from dbEST and subjected to
quality control screening to eliminate vector and E. coli contamination, low
complexity sequences, reads less than 100 bp in length, or sequences
containing greater than 3% ambiguous base calls (Ns). All rainbow trout full-length
cDNAs and coding sequences were downloaded and parsed from GenBank
records. These NP (for “nuc-prot”) sequences and the cleaned ESTs were
compared pair-wise using MGBLAST, a modified version of the MegaBLAST
program (Zhang et al., 2000), and placed into clusters based on sequence
similarity with a minimum requirement of 40 bp overlap and 94% sequence
identity. Sequences in individual clusters were assembled using Paracel
Transcript Assembler (version 2.6.2, http://www.paracel.com), a version of
the CAP3 program (Huang and Madan, 1999) adapted for EST assembly.
Tentative Consensus sequences (TCs) were defined as assembled contigs of two or
more sequences. TCs were annotated using known gene sequence content
where possible and best hits derived from DPS (DNA Protein Search, Huang
et al., 1997) sequence homology searches of TCs against a non-redundant
amino acid database (NRAA) at TIGR; for all TCs the top five protein hits
above a cutoff score of 350 are displayed.
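
The clustering rule described above (at least 40 bp of overlap at 94% identity or better) can be illustrated with a small union-find sketch in Python. This is not the TIGR pipeline: it assumes that pairwise hits have already been computed, that clusters are formed by single linkage (any qualifying hit joins two clusters), and it uses invented sequence names and hit values.

```python
def cluster_sequences(seq_ids, hits, min_overlap=40, min_identity=94.0):
    """Group sequences by single linkage over qualifying pairwise hits.

    hits: iterable of (id_a, id_b, overlap_bp, percent_identity).
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, overlap, identity in hits:
        if overlap >= min_overlap and identity >= min_identity:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for s in seq_ids:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical reads and pairwise hits.
ids = ["est1", "est2", "est3", "est4", "est5"]
pairwise = [
    ("est1", "est2", 120, 98.0),  # qualifies
    ("est2", "est3", 35, 99.0),   # overlap too short
    ("est3", "est4", 60, 95.5),   # qualifies
]
print(cluster_sequences(ids, pairwise))
```

In this toy input, est5 remains a singleton, and the two multi-member clusters are the groups that would go on to assembly.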

The TIGR Gene Index databases currently represent more than 60
species, which are used to create EGO, a database of putative Eukaryotic Gene
Orthologs (Lee et al., 2002). TC and NP sequences from each of the
individual Gene Index databases, including rainbow trout, were compared pair-wise
using BLAST, and sequence-specific best hits with an e-score less than 1 x 10^-10
were grouped using a transitive reciprocal best hit algorithm to produce
Tentative Ortholog Groups (TOGs). The best match pairs between rainbow trout
TCs and those of other related species were extracted from the EGO database.
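
The core of a reciprocal best hit comparison can be sketched in a few lines of Python. The example below is a simplification and not the EGO implementation: it assumes hits have already been filtered by the e-score threshold, ranks them by a hypothetical similarity score, and stops at pairwise reciprocal best hits without the transitive grouping step that produces TOGs. All identifiers and scores are invented.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict of (query, subject) -> similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return (a, b) pairs in which each sequence is the other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores between trout TCs and human sequences.
trout_vs_human = {("TC1", "H1"): 300, ("TC1", "H2"): 250, ("TC2", "H2"): 180}
human_vs_trout = {("H1", "TC1"): 300, ("H2", "TC1"): 250, ("H2", "TC2"): 180}
print(reciprocal_best_hits(trout_vs_human, human_vs_trout))
```

Here TC2's best hit is H2, but H2's best hit is TC1, so only the TC1/H1 pair is reciprocal.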


Results and discussion 

Library construction 

The primary library consisted of 8.0 x 10^7 colony forming
units with average insert size of 2.0 kb and 96% of clones
containing inserts of > 200 bp. The normalized library consisted
of 1.23 x 10^7 colony forming units with average insert size
1.62 kb. The observed reduction in GAPDH was 85-fold.

Sequencing analysis 

Quality analysis of 61,185 chromatograms is represented in
Fig. 1. Seventy-seven percent (47,051) of the starting sequences
passed the quality filtering and were submitted (October 30,
2002) to dbEST (ACC nos. CA341551-CA388601). Of the rejected
sequences, 13,952 were rejected because they contained less than
100 base pairs of quality non-vector sequence and 182 were
rejected based on low complexity. The mean length of the
submitted sequences was 585 base pairs, with a standard deviation
of 168 and a size range from 100 to 885. BLAST analysis of
individual ESTs resulted in 54% of the ESTs with a significant
BLASTN hit and 58% with a significant BLASTX hit at
an e-value cutoff of 1e-5 (Fig. 2). All BLAST results can be obtained at
http://ncccwa.ars.usda.gov/NCCCWAProjects.htm.
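
One way to read the quality criterion (PHRED > 20 over 100 base pairs, with Fig. 1 describing continuous base pairs meeting the threshold) is as a test on the longest uninterrupted run of high-quality base calls. The Python sketch below implements that reading. It is an interpretation made for illustration, not the filtering code used in the study, and the quality values are invented.

```python
def longest_quality_run(quals, threshold=20):
    """Length of the longest contiguous stretch of base calls with quality > threshold."""
    best = current = 0
    for q in quals:
        current = current + 1 if q > threshold else 0
        best = max(best, current)
    return best


def passes_quality(quals, threshold=20, min_length=100):
    """True if the read contains at least min_length contiguous high-quality bases."""
    return longest_quality_run(quals, threshold) >= min_length


# Hypothetical per-base quality scores for two reads.
good_read = [10] * 5 + [30] * 120 + [15] * 3 + [40] * 50
short_read = [30] * 99

print(longest_quality_run(good_read), passes_quality(good_read))
print(longest_quality_run(short_read), passes_quality(short_read))
```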

Fig. 2. BLAST scores obtained by BLASTN and BLASTX for each
individual EST against nucleotide and protein databases. BLAST analysis resulted in
54% of the ESTs with a significant BLASTN hit and 58% with a significant
BLASTX hit at an e-value cutoff of 1e-5. The ratio of putative homologies
identified by nucleotide-protein searches (BLASTX) versus
nucleotide-nucleotide searches (BLASTN) increased dramatically with BLAST scores.

Non-redundant clustering, assembly and annotation

Rainbow trout ESTs and gene sequences (NPs) were
downloaded from GenBank and assembled and annotated as
described to produce a Rainbow Trout Gene Index (RTGI), at
http://www.tigr.org/tdb/tgi/rtgi/. An overview of the clustering,
assembling and annotation of the transcript sequences for the
rainbow trout gene index build and release is shown in
Fig. 3.

Quality control filtering of rainbow trout ESTs from dbEST,
which included sequences from both this and other projects,
eliminated 25 (25 from this project) based on low complexity, 558
(554) based on matches to E. coli, 72 (72) that were
shorter than 100 base pairs, and 78 (69) matching the UniVec
database.

Fig. 3. Schematic overview of the clustering,
assembling and annotation of rainbow trout
transcript sequences in the Rainbow Trout Gene
Index, with the number of sequences at each stage
in the process. All rainbow trout ESTs, full-length
cDNAs and coding sequences are submitted to
clustering and assembly, resulting in unique
singletons and Tentative Consensus sequences (TCs)
with a minimum requirement of 40 bp overlap
and 94% sequence identity. Nucleotide-Protein
sequences (NPs) typically have annotation;
therefore, TCs containing NPs and NP singletons
require no further annotation. ESTs and TCs
containing only ESTs having DNA Protein Search
(DPS) matches are further annotated with Gene
Ontology assignments.

The RTGI was built from 47,621 rainbow trout sequences,
including 46,787 ESTs and 834 NPs, and consisted of 25,625
total unique sequences including 7,956 TCs, 190 singleton NPs
(sNPs), and 17,479 singleton ESTs (sESTs). A total of 7,646 TCs
containing no NPs were assembled from 27,860 ESTs, 60
TCs were assembled from known genes only (218 NPs), and
1,438 ESTs were clustered with 426 previously known genes to
form 250 TCs.
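
The composition of the index can be checked with a few sums. The Python sketch below simply adds up the counts quoted in this paragraph; the variable names are illustrative.

```python
# Counts quoted in the text for the Rainbow Trout Gene Index build.
tcs_by_content = {"EST only": 7646, "NP only": 60, "EST and NP": 250}
singleton_nps = 190
singleton_ests = 17479
nps_by_fate = {"in NP-only TCs": 218, "in mixed TCs": 426, "singletons": singleton_nps}

total_tcs = sum(tcs_by_content.values())
unique_sequences = total_tcs + singleton_nps + singleton_ests
total_nps = sum(nps_by_fate.values())

print(total_tcs)         # 7956 TCs
print(unique_sequences)  # 25625 unique sequences
print(total_nps)         # 834 NPs
```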

TCs containing NP sequences and those consisting only of
ESTs were annotated differently. For
each TC containing an NP, the definition line associated with
the NP’s GenBank record was parsed and used as the annotation
for that transcript. TCs containing only ESTs and singleton
ESTs were searched against a non-redundant amino acid
database (NRAA) using DPS and the top hit with a minimum score
of 350 was used as the annotation. Of the 7,647 TCs containing
only EST sequences, 3,912 were annotated using this approach;
the remaining 3,740 TCs containing 10,727 ESTs could not be
annotated based on protein similarity (Fig. 3). Of the 17,479
singleton ESTs, only 4,690 hit proteins in NRAA, while 12,789
had no significant protein matches (Fig. 3). Among the
matched proteins in NRAA for rainbow trout sequences (TC +
sEST), approximately 10% are “fish” proteins from zebrafish,
catfish, medakafish, and others.

The assembling process effectively increased the average
length of the sequences to 908 nucleotides (Fig. 4A). The
majority of TCs (72.5%) are in the range of 600-1,200 bp in
length while 157 TCs (2% of total TCs) are longer than 2,000
nt. The longest TC (TC3329, 6,707 bp) is comprised primarily
of NPs and encodes a glucocorticoid receptor. The abundance
of any transcript can be estimated by the number of component
ESTs (depth) contained in its corresponding TC. The average
number of component ESTs in the TCs is 3.7. While there are
some abundant transcripts (5.4% with ten or more components
and 20% with five or more), nearly 51% of the TCs contain
only two sequences and 20% contain three components
(Fig. 4B). The deepest (most abundant) TC is TC1 with 82
component ESTs, probably encoding Ca2+-ATPase. TCs with
greater length or more component sequences are more likely
to have protein hits.
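
The depth statistics quoted above (mean number of components, and the fractions of TCs with two, five or more, or ten or more components) are straightforward to derive from a list of per-TC component counts. The Python sketch below shows the calculation on a small invented list; it does not use the actual RTGI depths.

```python
from collections import Counter


def depth_summary(depths):
    """Summarize TC depth (number of component sequences per TC)."""
    n = len(depths)
    counts = Counter(depths)
    return {
        "mean_depth": sum(depths) / n,
        "frac_exactly_2": counts[2] / n,
        "frac_5_or_more": sum(1 for d in depths if d >= 5) / n,
        "frac_10_or_more": sum(1 for d in depths if d >= 10) / n,
    }


# Hypothetical depths for six TCs (every TC has at least two components).
toy_depths = [2, 2, 2, 3, 5, 10]
summary = depth_summary(toy_depths)
print(summary)
```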

Gene ontology analysis 

Gene Ontology (GO) has been widely accepted as a general
means to annotate DNA sequences (www.geneontology.org;
The Gene Ontology Consortium, 2001), providing a controlled
vocabulary for describing the cellular localization, molecular
function, and biological process associated with each encoded
protein. The GOpep database is comprised of available amino
acid sequences that have been manually assigned an effective
GO id and GO term. All unique sequences (TC + sEST + sNP)
were searched against a GOpep database to associate
nucleotide sequences with GO annotated protein sequences. The
putative GO id and GO term associated with the top hits, based
on a minimum DPS score of 350, were assigned to that
nucleotide sequence. Of 25,625 rainbow trout transcripts, 5,251
(20%) were assigned one or more GO terms.

GO assignments for molecular function, biological process,
and cellular localization include various levels of assignment in
the form of a controlled vocabulary with increasing specificity
of assignment as the “level” of annotation increases. The
number of sequences which could be assigned a GO term remained
relatively stable from level 1 to level 4 (Fig. 5). At level 5 or 6,
the number of assigned sequences in the hierarchy of cellular
component and molecular function dropped dramatically; very
few sequences could be assigned very specific terms deep within
the hierarchy.
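
The notion of a GO "level" can be made concrete as the distance of a term from the root of the hierarchy. The Python sketch below computes levels by breadth-first search over a toy parent-to-children map. It makes two assumptions that the text does not confirm: a term reachable by several paths takes its shortest distance from the root, and the intermediate node is given a plain label. Only the root id and the two level 2 terms are taken from this paper (Fig. 5 and Table 1).

```python
from collections import Counter, deque


def go_levels(children, root):
    """Breadth-first distance of every reachable term from the root (root = level 0)."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            if child not in levels:  # keep the shortest path in a DAG
                levels[child] = levels[term] + 1
                queue.append(child)
    return levels


# Toy fragment of the hierarchy; "molecular_function" is a label, not an id.
toy_children = {
    "GO:0003673": ["molecular_function"],
    "molecular_function": ["GO:0005488", "GO:0003824"],
}
levels = go_levels(toy_children, "GO:0003673")

# Hypothetical assignments of sequences to terms, tallied by level.
assignments = {"TC_a": "GO:0005488", "TC_b": "GO:0003824", "sEST_c": "GO:0005488"}
per_level = Counter(levels[term] for term in assignments.values())
print(levels)
print(per_level)
```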

Fig. 4. Assembly summary for the Tentative
Consensus sequences (TCs). (A) Length distribution
shown in base pairs as TCs with and without
protein matches from the DNA Protein Search
(DPS). The majority of TCs (72.5%) are in the
range of 600-1,200 bp in length while 157 TCs
(2% of total TCs) are longer than 2,000 nt.
(B) Number of component sequences (ESTs or
NPs) within each TC, shown as TCs with and
without protein matches from the DPS. The
average number of components is 3.7; 5.4% have
ten or more components and 20% have five or
more. Nearly 51% of the TCs contain two
components and 20% contain three.

Fig. 5. Gene Ontology (GO) annotation of the
unique set of rainbow trout transcript sequences
(TCs + singletons), showing the number of
sequences assigned annotation for all three GO
categories as a function of depth within the GO
hierarchy. GO:0003673 (GO term: gene ontology), as
the root for all the GO hierarchy, was counted as
level 0.

Table 1. Gene Ontology analysis of the unique
rainbow trout sequences. Rainbow trout unique
sequences (25,625) were searched against GOpep
using DNA Protein Search. The number of
sequences assigned to each level 2 subcategory of
molecular function, cellular component and biological
process, and the percentage of assignments within
each category, are shown.

Category    GO id        GO term                                     No. sequences   Percentage

Function    GO:0003774   motor activity                                         59         0.8%
            GO:0016329   apoptosis regulator activity                           29         0.4%
            GO:0003793   defense/immunity protein activity                      40         0.5%
            GO:0015465   lysin activity                                          1         0.0%
            GO:0004871   signal transducer activity                            481         6.5%
            GO:0008435   anticoagulant activity                                 18         0.2%
            GO:0005488   binding activity                                    2,499        33.6%
            GO:0045182   translation regulator activity                        155         2.0%
            GO:0008369   obsolete                                               48         0.6%
            GO:0008580   cytoskeletal regulator activity                         3         0.0%
            GO:0030234   enzyme regulator activity                             141         1.9%
            GO:0008638   protein tagging activity                                7         0.1%
            GO:0005215   transporter activity                                  700         9.4%
            GO:0030533   triplet codon-amino acid adaptor activity               4         0.1%
            GO:0015070   toxin activity                                          1         0.0%
            GO:0030528   transcription regulator activity                      321         4.3%
            GO:0016209   antioxidant activity                                   15         0.2%
            GO:0003824   enzyme activity                                     2,185        29.4%
            GO:0005194   cell adhesion molecule activity                        35         0.5%
            GO:0003754   chaperone activity                                     82         1.1%
            GO:0005554   molecular function unknown                            266         3.6%
            GO:0005198   structural molecule activity                          354         4.8%

Component   GO:0005623   cell                                                3,472        74.4%
            GO:0019012   virion                                                 14         0.3%
            GO:0005576   extracellular                                         763        16.4%
            GO:0008370   obsolete                                               47         1.0%
            GO:0005941   unlocalized                                            63         1.4%
            GO:0008372   cellular component unknown                            306         6.6%

Process     GO:0000004   biological process unknown                            277         4.3%
            GO:0007275   development                                           561         8.7%
            GO:0007582   physiological processes                             3,511        54.4%
            GO:0016032   viral life cycle                                        2         0.0%
            GO:0009987   cellular process                                    2,048        31.7%
            GO:0007610   behavior                                               52         0.8%


Table 2. Representation of orthologs of rainbow trout genes in the TIGR
EGO database. Unique sequences (including TCs and singleton ESTs) from
individual TIGR gene indices were compared pair-wise and
sequence-specific best hits were associated together to generate EGO groups (Lee et al.,
2002). The number of best match pairs between rainbow trout and other
species and average percent identity of the matches were extracted from the
EGO database.

EGO pair                     No. EGO pairs   Average identity

Rainbow Trout-Human                  4,235             71.23%
Rainbow Trout-Mouse                  4,011             71.03%
Rainbow Trout-Rat                    2,911             71.30%
Rainbow Trout-Porcine                2,097             71.38%
Rainbow Trout-Cattle                 2,733             71.45%
Rainbow Trout-Chicken                3,327             71.29%
Rainbow Trout-Frog                   3,034             70.95%
Rainbow Trout-Zebrafish              3,222             74.78%
Rainbow Trout-Catfish                  621             75.74%
Rainbow Trout-Medakafish             2,334             75.72%
Rainbow Trout-Sea squirt             1,428             65.38%
Rainbow Trout-Arabidopsis            1,030             64.54%
Rainbow Trout-Yeast                    467             63.00%


GO assignments within each of the three major functional
categories are summarized in Table 1. While most of the gene
products remain within the cell, 16% were annotated as
extracellular proteins. The majority of the annotated genes are
associated with physiological and cellular processes. Within
molecular function assignments, approximately 60% of the
sequences are related to enzyme (29%) and/or binding (33%)
activity. Other highly represented functional classes include
transporter (9%), signal transducer (6%), structural molecule
(5%), transcription (4%) and translation (2%). The broad range
of the activities represented by these diverse gene functions
could well result from the combination of six tissue sources for the
cDNA clones.
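
The percentages in Table 1 appear to be computed relative to the total number of assignments within each category rather than to the 25,625 unique sequences. The Python sketch below tests that reading on the cellular component counts; the function name is illustrative and the interpretation is an assumption, although it reproduces the tabulated values.

```python
def category_percentages(counts):
    """Percentage of each term relative to all assignments in the category."""
    total = sum(counts.values())
    return {term: round(100.0 * n / total, 1) for term, n in counts.items()}


# Cellular component counts from Table 1.
component_counts = {
    "cell": 3472,
    "virion": 14,
    "extracellular": 763,
    "obsolete": 47,
    "unlocalized": 63,
    "cellular component unknown": 306,
}
component_pct = category_percentages(component_counts)
print(component_pct)
```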

Orthologous genes from other related species 

TCs from nearly 60 organisms have been clustered based on
conserved homology to generate Eukaryotic Gene Ortholog
(EGO) groups (Lee et al., 2002). The EGO pairs involving
rainbow trout and other related species were extracted from the
EGO database (Table 2). Human and mouse represent the
most frequently matched species for rainbow trout, which is
due to the relative “completeness” of the gene coverage
provided by the extensive genomic and EST sequence data
available for these organisms. Other important organisms with high
agricultural value, including pig, cattle, and chicken, had
considerable numbers of ortholog pairs with rainbow trout, at
a level of identity similar to that of human and mouse. The fish
species, including zebrafish, catfish, and medakafish, match fewer
trout sequences but with a higher degree of identity. While
some highly conserved trout genes, including structural
molecules and enzymes, had orthologous pairs from distantly
related species such as Arabidopsis and yeast, the number of the
pairs and the relative degree of identity remain lower.

Summary 

A large number of EST sequences were obtained by single
pass five prime sequencing of a multi-tissue normalized cDNA
library. More than 50% were functionally annotated based on
comparative analysis to facilitate and enhance genomic
research in rainbow trout and other salmonids. The ESTs were
used to construct a Rainbow Trout Gene Index (RTGI)
database organizing sequences into clusters and assemblies to
produce virtual transcripts (TCs) that are annotated and provided
as a resource to the research community. The RTGI will
continually be updated as new EST and gene sequence data
become publicly available. Annotation and sequence
information will prove useful in the identification of ESTs of interest
for inclusion into microarrays, mapping positional candidate
genes (Collins, 1995), and developing comparative maps of
Type I loci (O’Brien et al., 1993). Novel sequences which are
not functionally annotated could be included in functional
genomic experiments to evaluate their expression patterns.

The 13th North American Colloquium on Animal
Cytogenetics and Gene Mapping took place at the Louisville Zoo in
Louisville, Kentucky, USA on July 13-17, 2003. This
colloquium has been held in North America in alternate years,
bringing together leading scientists working on animal
genomics research, specifically cytogenetics, gene mapping,
comparative genomics, and evolution. The venue of the Louisville Zoo
was chosen to highlight research on the conservation of wildlife
species. Thirty scientists, including nine graduate students and
postdoctoral scholars, participated in the colloquium. The
presentations ranged from development of framework gene maps
in domestic livestock species to chromosome evolution in
endangered species.

The invited speakers included Kurt Benirschke from the
University of California San Diego Medical Center and Evan
Eichler from Case Western Reserve University School of Medicine
and the Center for Computational Genomics. Dr. Benirschke
discussed the importance of chromosomal structural change in
mammalian evolution. He presented several examples of how
chromosomal fusion led to new species in Apennine mice and
several species of apes. Dr. Benirschke emphasized the
importance of carefully characterizing the genetics of each
species, preserving cells from all species and the potential of
cloning endangered species.

Dr. Eichler discussed the origin and impact of segmental
duplications in the human and other mammalian genomes. The
recent publication of the completed or nearly completed genome
sequences for humans and mice has led to the identification of
de novo segmental duplications in these species, many of which
appear to have occurred in concert with speciation.
Duplications provide raw material for genetic adaptation, and
the extent of the changes emphasizes the importance of
distinguishing between homologues, orthologues and paralogues
in comparative gene mapping.


Update on the horse genome mapping project 

E. Bailey 

Department of Veterinary Science, MH Gluck Equine Research Center,
University of Kentucky, Lexington, KY (USA)

Horse gene mapping has been the subject of a close-working community
of scientists since formation of the workshop in 1995. The workshop has met
in connection with the USDA-NRSP8-San Diego Meeting every January, at
Havemeyer Foundation-conducted workshops every two years (Kentucky/
San Diego/Sweden/Australia/South Africa) and at the ISAG meetings. The
next meeting is a Havemeyer workshop in August 2003 in South Africa.
Approximately 100 scientists from 25 laboratories worldwide participate in
the workshop. The earliest activities of the workshop directly involved
cytogenetic studies. Zoo-FISH work by Raudsepp, Chowdhary and others
demonstrated extensive conservation of genomic organization between humans
and horses. The cytogeneticists in the group collaborated to produce and
publish an idiogram for the horse that proved essential for comparing results
between laboratories. Three linkage maps have been published, based on two
half-sib families (Sweden and International Reference Family) and a full-sib
family (Newmarket, UK). The combined maps include over 500 markers. A
consensus CRIMAP analysis of these data will be the subject of the August
Havemeyer workshop. Approximately 350-400 genes and DNA markers have
been FISH mapped by scientists from many laboratories, including scientists
present at this meeting. When markers are both FISH mapped and linkage
mapped, they identify the chromosomal location of linkage groups and their
orientation with respect to the centromeres. Another valuable tool has been a
synteny mapping panel developed at the University of California, Davis. The
synteny panel was the first means of identifying the chromosomal location
of many linkage groups and was valuable for independent confirmation of FISH
mapping assignments. Recently, scientists from Texas A&M University
developed a radiation hybrid panel for the horse and published a
first-generation map with 715 markers, including microsatellite markers and
gene markers. The RH map provides an opportunity to integrate genetic markers
and genes as well as to develop a higher resolution comparative map between
the horse and other species, especially the human.

So far we have slightly more than 1,000 markers mapped by at least one
approach, most of them with multiple tools. Our objectives for the next two
years will be set at the meeting in South Africa but will almost certainly
involve at least doubling the number of mapped markers by 2005.
The map has already demonstrated its usefulness for genome scanning
studies to identify genes responsible for simple Mendelian traits, including
SCID, hair color dilution and gray coat color. But the traits of greatest
interest are those related to muscle and bone diseases, susceptibility to
infectious diseases and other complex traits. Identifying the genetic
components of these traits will require more powerful tools and excellent
family and diagnostic material.

Certainly, the horse genome will be sequenced one day. As discussed at 
this meeting, the horse occupies an important niche in the phylogenetic tree 
of animals. But with the strong homology of gene organization that exists 
among vertebrate species, a dense gene map will be useful to address the 
research questions that drive the workshop participants. 


The importance of chromosomal structural change in 
mammalian evolution 

Invited Speaker: K. Benirschke 

University of California San Diego Medical Center, San Diego, CA 
(USA) 

Chromosomal “errors” occur frequently in human conceptions - they are
a common cause of spontaneous abortion. This is not the case in observed
pregnancies of zoo animals, and relatively few errors have been described in
any mammal. On the other hand, major chromosomal change must occur
occasionally as a cause of significant evolution. The best-studied examples
are the Apennine mice of Switzerland and Italy, in which various fusions of
acrocentrics have led to numerous new species. When studies of known
taxonomic evolutionary events are charted, chromosomal fusion appears to be a
major mechanism of speciation (Rupicaprinae; immigration of mammals to
South America; apes, etc.). The consequence of such evolutionary steps is the
initiation of a new species from a stock that has very little genetic
variability (“inbreeding depression” may result). Other mechanisms such as
inversions (fission?), while common in some groups (Atelinae; orangutans), are
less well explored, and continued hybridization is common in such captive
animals - it should be prevented. But we are at the end of animal
importations and thus also at the end of our ability to study these features.
For this reason the San Diego Zoo has espoused the creation of a “Frozen Zoo”
to serve as a reservoir for future studies of chromosomes and genomes. Its
origin and usage will be discussed briefly. These cells could be used in
future cloning efforts, and some already have been. For cloning, however,
knowledge of placentation is essential, and it is currently too poorly
understood. For that reason, a web site was established to gather relevant
knowledge of comparative placentation (http://medicine.ucsd.edu/cpa). This
will be discussed briefly. A plea is made for the systematic collection of
cells from all species and the accumulation of data on reproductive features,
such as placentation.


Development of an ordered genomic microarray for 
comparative genomic hybridization analysis of canine 
cancer 

R. Thomas,b H. Fiegler,c A. Scott,a R. Hudson,b L. Sabacan,d
C. Andre,e T. Lorentz,d C. Hitte,e H. Parker,d R. Guyon,e
E. Ostrander,d F. Galibert,e N.P. Carter,c M. Breen a

a College of Veterinary Medicine, NCSU, Raleigh, NC (USA); b Animal
Health Trust, Newmarket, Suffolk (UK); c The Wellcome Trust Sanger
Institute, Cambridge (UK); d Fred Hutchinson Cancer Research
Center, Seattle, WA (USA); e UMR 6061 CNRS, Faculté de Médecine,
Rennes (France)

Comparative genomic hybridization (CGH) is a fluorescence in situ
hybridization (FISH) technique used widely in cancer cytogenetics for the
comprehensive analysis of imbalanced chromosomal material within an
entire tumor genome. Unlike conventional cytogenetics, CGH enables all
major copy number changes within a tumor genome to be visualized
simultaneously in a single reaction. We recently developed robust
metaphase-based CGH techniques for the dog and are now using this approach
routinely for a variety of ongoing cancer studies.

We are also developing array-based canine CGH as a means to increase
the resolution and throughput of CGH analysis for canine tumor
cytogenetics. Eight hundred canine bacterial artificial chromosome (BAC)
clones have been assembled into chromosome-specific tiling panels on the
basis of their genomic location as determined by FISH and/or radiation hybrid
mapping analysis. Multicolor FISH was used to verify the assignment of each
clone and determine the accurate cytogenetic order along the length of each
dog chromosome. DNA from each clone has been amplified by degenerate
oligonucleotide primed (DOP) PCR and printed onto glass slides to form an
ordered microarray.

A preliminary canine CGH array, comprising 26 canine cancer genes and 
61 additional canine BAC clones, has been developed and tested by a series 
of validation experiments using both normal and tumor DNA samples. The 
results confirmed the ability of the array to detect genomic imbalances that 
correlate with data obtained previously with conventional metaphase CGH. 
We are now progressing towards the generation of the genome-wide array, 
with clones spaced at 3-5 Mb intervals. 
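
As an illustration of how array CGH data of this kind are commonly
interpreted, the sketch below labels each clone as a gain, a loss or
balanced from the log2 ratio of tumor to normal signal. The abstract does
not describe the analysis actually used; the clone names, intensity values
and the 0.3 threshold are hypothetical and chosen only for the example.

```python
import math


def classify_clones(tumor, normal, threshold=0.3):
    """Label each clone from its log2(tumor/normal) intensity ratio.

    tumor, normal: dicts mapping clone name -> signal intensity.
    threshold: hypothetical cut-off on the absolute log2 ratio.
    """
    calls = {}
    for clone, tumor_signal in tumor.items():
        log_ratio = math.log2(tumor_signal / normal[clone])
        if log_ratio > threshold:
            calls[clone] = "gain"
        elif log_ratio < -threshold:
            calls[clone] = "loss"
        else:
            calls[clone] = "balanced"
    return calls


# Invented intensities for three placeholder clones.
tumor = {"BAC_001": 1500.0, "BAC_002": 1000.0, "BAC_003": 480.0}
normal = {"BAC_001": 1000.0, "BAC_002": 1000.0, "BAC_003": 1000.0}
calls = classify_clones(tumor, normal)
```

With clones ordered along each chromosome, runs of adjacent "gain" or
"loss" calls would indicate the extent of an imbalanced region.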


Understanding the origin and impact of segmental 
duplications: A comparison of human and rodent 
genomes 

Invited Speaker: E.E. Eichler 

Case Western Reserve University School of Medicine and the Center 
for Computational Genomics, Cleveland, OH (USA) 

It has been estimated that 5% of the human genome consists of
interspersed duplicated material that has arisen over the last 30-40 million
years of evolution. A large proportion of these duplications exhibit an
extraordinarily high degree of sequence identity at the nucleotide level
(>95%) spanning large (1-100 kb) genomic distances. Through processes of
non-allelic homologous recombination, these same regions are targets for
rapid evolutionary turnover among the genomes of closely related primates as
well as a significant source of genetic disease in the human population.
Preliminary analyses have suggested that the amount of segmental duplication
may be a relatively unique property of our genome. We have begun a detailed
analysis of duplication content in other vertebrate genomes to assess the
extent of this genomic property. I will present a comparative overview of the
organization of recent segmental duplications within the human and mouse
genomes and provide an analysis of the breakpoints of these duplications,
which suggests new insight into their mechanism of origin and chromosomal
evolution. Further, our data indicate that a small fraction of important
human genes may have emerged recently through duplication processes and will
not possess definitive orthologues in the genomes of model organisms.


Cytogenetic correlates of reproductive failure in
species/subspecies crosses in zoos

M.L. Houck

Center for Reproduction of Endangered Species (CRES), Zoological
Society of San Diego, San Diego, CA (USA)

Cytogenetic analysis has become an essential tool in the management of
captive animals. Cryptic chromosomal variation among individuals thought
to be a single species (based on phenotype) has been reported in many
captive populations, including artiodactyls (e.g. Soemmerring’s gazelle,
dik-dik, waterbuck) and primates (e.g. spider monkeys, squirrel monkeys, owl
monkeys). It is not always clear whether populations of uncertain
taxonomic status are differentiated at the specific or subspecific level, but
the reproductive consequences of crossing them often include sterility or
reduced fertility of offspring. Current research has identified this
phenomenon in genets (Viverridae), nocturnal carnivores that inhabit forests,
savannahs and grasslands of Europe and Africa.

The taxonomy of the genus Genetta is not well understood, and
misidentification of animals has occurred frequently. In captivity, animals
are often grouped phenotypically by spot size into two groups, large-spotted
and small-spotted. Cytogenetic studies of the San Diego Zoo genet population
identified diploid numbers of 52 and 54 in animals with large spots
(G. maculata and G. tigrina) and a third distinct 2n = 54 karyotype in
animals with small spots (G. genetta). Karyotype analysis of two male and two
female large-spotted genets indicated different diploid numbers for the males
(2n = 54) and the females (2n = 52). The cytogenetic data, along with expert
identification of the phenotype, later allowed positive species assignment of
G. tigrina for the males and G. maculata for the females, but not before
breeding of one pair resulted in twin births, followed by a single birth the
next year. Although only one of the offspring survived, skin biopsies were
collected from all three, allowing determination of the diploid number in
each as 2n = 53. However, G-band analysis indicates that the karyotypic
difference is not due to a simple fission/fusion event. Major rearrangements
including chromosome breaks, translocations and fusions have occurred between
the 2n = 52 and 2n = 54 karyotypes, suggesting that although F1 hybrids may
be conceived and survive, they will be sterile or subfertile. Further studies
using FISH are recommended to clarify the complex chromosomal rearrangements
observed in this genus.
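
The 2n = 53 count in the offspring follows from simple arithmetic: each
parent contributes a haploid set, so the hybrid carries 27 + 26 chromosomes.
The sketch below shows that calculation, assuming each parent produces
balanced gametes with exactly half of its diploid number; the function names
are illustrative only.

```python
def gamete_count(diploid_number):
    """Haploid chromosome number of a balanced gamete."""
    if diploid_number % 2 != 0:
        raise ValueError("a balanced gamete needs an even diploid number")
    return diploid_number // 2


def f1_diploid_number(sire_2n, dam_2n):
    """Expected diploid number of an F1 offspring from two balanced gametes."""
    return gamete_count(sire_2n) + gamete_count(dam_2n)


# Males 2n = 54 and females 2n = 52, as reported in the abstract.
hybrid_2n = f1_diploid_number(54, 52)
```

The odd diploid number means that the two parental chromosome sets cannot
pair one-to-one at meiosis, which is consistent with the sterility or
subfertility predicted for the hybrids.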


A framework for developing the pig QTL database

Z. Hu and M. Rothschild 

Department of Animal Science, Iowa State University (USA) 

One application of genetic linkage maps in livestock species is mapping
loci underlying the genetic differences in economically important traits. As a
result of active mapping of quantitative trait loci (QTL) in pigs during the
past decade, hundreds of QTL in the pig genome have been reported for growth,
meat quality, reproduction, disease resistance and other traits. We are
developing a pig QTL database to allow easy search and comparison of publicly
available QTL data. Several approaches have been taken to accommodate the
complex needs of QTL information storage, organization and presentation.
We have introduced a “trait ontology” concept to standardize the way animal
traits are named and to simplify the way that the traits are organized, so
that comparisons of QTL data become possible. The database schema is
relational and makes use of existing pig map databases and
other publicly available database resources. The pig QTL database is also
designed to include data representing major genes and markers having large
effects on economically important traits. Efforts are under way to make it
part of the integrated functional genomics resources for pigs.
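
A minimal sketch of how a trait ontology might sit in a relational schema
is given below, using Python's built-in sqlite3 module. It is not the
authors' schema: the table and column names, the trait hierarchy and all rows
are invented placeholders. Each trait points to a parent trait, and a
recursive query collects the QTL recorded for a trait class and everything
beneath it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trait (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,
    parent_id INTEGER REFERENCES trait(id)   -- NULL for a top-level class
);
CREATE TABLE qtl (
    id          INTEGER PRIMARY KEY,
    trait_id    INTEGER NOT NULL REFERENCES trait(id),
    chromosome  TEXT NOT NULL,
    position_cm REAL
);
""")

# Hypothetical ontology: two top-level classes, each with one child trait.
conn.executemany(
    "INSERT INTO trait (id, name, parent_id) VALUES (?, ?, ?)",
    [(1, "meat quality", None), (2, "pH", 1),
     (3, "growth", None), (4, "average daily gain", 3)],
)
# Invented QTL records on placeholder chromosomes.
conn.executemany(
    "INSERT INTO qtl (trait_id, chromosome, position_cm) VALUES (?, ?, ?)",
    [(2, "chrA", 80.0), (4, "chrB", 65.0), (1, "chrC", 70.0)],
)


def qtl_for_trait_class(conn, trait_name):
    """Return QTL for a trait and all of its descendants in the ontology."""
    return conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM trait WHERE name = ?
            UNION ALL
            SELECT t.id FROM trait t JOIN subtree s ON t.parent_id = s.id
        )
        SELECT q.chromosome, q.position_cm, t.name
        FROM qtl q JOIN trait t ON t.id = q.trait_id
        WHERE q.trait_id IN (SELECT id FROM subtree)
        ORDER BY q.chromosome
        """,
        (trait_name,),
    ).fetchall()


meat_qtl = qtl_for_trait_class(conn, "meat quality")
```

Because every QTL is attached to a standardized trait name, reports that
used different wording for the same trait can be compared through one query.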


Comparative cytogenetics and gene mapping in
rhinoceroses

T.L. Lear,a M. Houck,b O.A. Ryder b

a M.H. Gluck Equine Research Center, Veterinary Science Department,
University of Kentucky, Lexington, KY; b Center for Reproduction of
Endangered Species, Zoological Society of San Diego, San Diego, CA
(USA)

Despite intensive conservation efforts, rhinoceroses are still highly
endangered and difficult to breed in captivity. Reproduction in captive
rhinoceroses may result in offspring sterility, embryonic loss, and
stillbirths. Potential involvement of chromosomal factors in their poor
reproduction needs to be examined. Karyotyping can be used to assess
cytogenetic status and breeding soundness; however, it is difficult to make
these assessments in rhinoceroses using standard cytogenetic methods.
Rhinoceros cells are difficult to culture. Their chromosomes have p-arm
polymorphisms, numerous small chromosomes, and fission-fusion elements that
complicate analyses. Individuals with incompatible chromosomes (different
subspecies) or with chromosome abnormalities may be bred, resulting in fewer
offspring. Identification of subspecies and abnormal individuals is
imperative for successful reproduction of these endangered species.

In order to characterize and identify rhinoceros chromosomes we are
mapping horse BAC clones by fluorescence in situ hybridization (FISH) to
the chromosomes of four rhinoceros species. This will enable the
identification of 1) homologous chromosome pairs regardless of chromosome
morphology, 2) fission-fusion elements seen in some individuals, 3) the
Y chromosome and 4) gross karyotypic differences between subspecies, as well
as 5) the characterization of p-arm polymorphisms.

Using FISH, 49 horse BAC clones were hybridized to Indian rhinoceros
chromosomes, identifying conservation of chromosome morphology (as a
metacentric or acrocentric chromosome) and gene order when compared to
horse chromosomes 1, 14, 16, 20, 21, 23, 29, 31, X and Y. Lack of
conservation was seen for horse chromosomes 2, 3, 4, 5, 8, and 11. P-arm
polymorphisms were identified for chromosomes homologous to horse chromosomes
8 and 20. In the Indian rhinoceros, the Y chromosome-specific probes, ZFY
and SRY, localized to a small acrocentric chromosome about the same size as
that found in the domestic horse. In addition, a telomere probe localized at a
terminal position on Indian, Eastern black and Southern white rhinoceros
chromosomes.

Supported by the Morris Animal Foundation. 


Towards generating a 1-Mb resolution map of the horse 
X chromosome 

E.-J. Lee, T. Raudsepp, L. Skow, B.P. Chowdhary 

Department of Veterinary Anatomy & Public Health, College of 
Veterinary Medicine, Texas A&M University, College Station, TX
(USA) 

The mammalian X chromosome comprises ~5% of the total genome
and is known to be highly conserved. Similarities in X chromosome banding
patterns across a variety of species, together with gene mapping data, lend
credence to this view. The horse X chromosome shares striking
similarities in banding patterns with its human counterpart. Recently, based
on a 43-marker radiation hybrid (RH) and comparative map, we showed that the
horse and human X chromosomes also have similar gene order. Expanding
on this work, we are developing a high-resolution map for ECAX, with the
aim of having one marker every megabase of the equine chromosome. For this,
we are choosing human genes located at 1-Mb intervals on the human X
chromosome, conducting multiple alignment of human, mouse and other
mammalian orthologous sequences, and identifying highly conserved regions to
develop horse-specific primers. The amplification product from these primers
in horse is verified by sequencing. The primers are then used for RH
mapping on the horse × hamster 5000 rad RH panel and for screening equine
BAC libraries. The BACs are being used for cytogenetic mapping and
development of microsatellite markers for linkage analysis. At present, we
have developed an RH map comprising a total of 147 markers, of which 112 are
genes and 35 microsatellites. This represents the most comprehensive
comparative map for the X chromosome among the domestic species. Comparative
organization of the equine X chromosome in relation to homologues in
human, mouse and other species will be discussed. Additionally, strategies to
expand the current work with the aim of targeting disease genes located on the
X chromosome will be elaborated.
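
The first step of this strategy, choosing human genes at roughly 1-Mb
intervals, can be sketched as a binning problem. The code below is only an
illustration under stated assumptions: gene names and positions are invented,
and picking the gene nearest the centre of each 1-Mb bin is one possible rule,
not necessarily the one the authors used.

```python
def pick_one_gene_per_bin(genes, bin_size=1_000_000):
    """Choose, for each bin, the gene closest to the bin centre.

    genes: iterable of (name, position_bp) pairs on one chromosome.
    Returns a dict mapping bin index -> chosen gene name; bins without
    any gene are absent from the result.
    """
    best = {}
    for name, position in genes:
        bin_index = position // bin_size
        centre = bin_index * bin_size + bin_size / 2
        distance = abs(position - centre)
        if bin_index not in best or distance < best[bin_index][1]:
            best[bin_index] = (name, distance)
    return {index: name for index, (name, _) in sorted(best.items())}


# Hypothetical genes and base-pair positions on a reference chromosome.
genes = [
    ("geneA", 150_000),
    ("geneB", 520_000),
    ("geneC", 1_900_000),
    ("geneD", 3_400_000),
]
selected = pick_one_gene_per_bin(genes)
```

Bins missing from the result (here the interval from 2 to 3 Mb) mark gaps
where no candidate gene is available and another marker type would be needed.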


Comparative mapping of the major histocompatibility 
complex locus in the family Equidae 

C.M. Mains,a T.L. Lear,a M.L. Houck,b R. Tallmadge,c
D. Antczak,c E. Bailey a

a Maxwell H. Gluck Equine Research Center, Dept. Veterinary Science, 
University of Kentucky, Lexington, KY; b Center for Reproduction of 
Endangered Species, Zoological Society of San Diego, San Diego, CA; 
c James A. Baker Institute for Animal Health, College of Veterinary 
Medicine, Cornell University, Ithaca, NY (USA) 

There are at least ten species within the family Equidae. The diploid
number ranges from 2n = 32 in Equus hartmannae to 2n = 66
in E. przewalskii, indicating that dramatic changes have occurred over the
3-5 million years since the species diverged from a common ancestor. The
major histocompatibility complex (MHC) is an important mediator of immune
response and exhibits high variability that aids in responding to a diverse
array of diseases. The MHC maps to human chromosome 6p21.3 and to
chromosome 20q14-q22 in the domestic horse, E. caballus. Furthermore, the
orientation of the horse MHC, with respect to the centromere, is the opposite
of that found in other mammals. In this project, we wished to determine
whether the synteny and orientation of the MHC were conserved among the
equids. Four horse BAC clones, two containing MHC class I genes and two
containing MHC class II genes, were mapped using fluorescence in situ
hybridization (FISH) in four of the extant equids: E. kiang (EKI), E. grevyi
(EGR), E. burchelli (EBU), and E. africanus somaliensis (EAF). The map
positions of class I and class II genes in these equids demonstrated
conservation of synteny for the MHC. In EKI, EGR and EBU the MHC class I and
II genes mapped to the q-arm of a single medium metacentric chromosome pair,
while the genes mapped to the q-arm of a large metacentric chromosome pair in
EAF, possibly EAF3. Banding studies are underway to identify each chromosome.
In addition, further comparative mapping for all equid species is underway
to determine whether the orientation of the MHC relative to surrounding genes
is conserved in this family.


Cytogenetic screening of a family of gaur 

G.F. Mastromonaco,a G. Crawshaw,b N.M. Loskutoff,c
W.A. King a

a Department of Biomedical Sciences, University of Guelph, Guelph,
ON; b Toronto Zoo, Scarborough, ON (Canada); c Henry Doorly Zoo,
Omaha, NE (USA)

The gaur (Bos gaurus), native cattle of southeast Asia, is currently listed
as endangered. A captive breeding program has been implemented to maintain
the current level of genetic diversity. However, due to the small founder
stock and 50 years of restricted breeding, the captive herd is showing signs
of inbreeding and reduced fertility. Chromosome abnormalities, in particular
Robertsonian translocations, have been associated with reduced reproductive
potential in Bovinae. As the frequency of carriers increases in the
population, the breed fertility often begins to decline. To maintain genetic
integrity and improve reproductive efficiency, principles applied in domestic
cattle breeding for screening donor individuals should be implemented in the
gaur breeding program. Establishment of screening protocols for the
assessment of reproductive potential in this and other exotic bovids will
provide an invaluable tool for the efficient use of individuals and space.
Recently, cells were banked from a female gaur at Toronto Zoo that died
unexpectedly. In a preliminary analysis, it was found that the individual had
2n = 57 chromosomes, instead of the normal 2n = 58, with an extra
submetacentric and the loss of two acrocentric chromosomes being observed.
The objective of this study was to examine the karyotypes of immediate family
members to determine whether the translocation arose de novo in this
individual or was inherited. Skin biopsies from individuals in the captive
population were obtained remotely using biopsy darts. Fibroblast cell
cultures were grown and stored frozen until karyotype analysis could be
carried out. The Toronto Zoo female and three close relatives (brother,
daughter, stillborn grandson) have been examined to date, with no further
abnormalities being detected. Analysis of related individuals currently
housed at other zoos is pending.


Conservation of a Robertsonian chromosome 
polymorphism in the Equidae 

J.L. Myka,a T.L. Lear,a M.L. Houck,b O.A. Ryder,b E. Bailey a

a M.H. Gluck Equine Research Center, Department of Veterinary 
Science, University of Kentucky, Lexington, KY; b Center for 
Reproduction of Endangered Species, Zoological Society of San Diego, 
San Diego, CA (USA) 

A centric fission (Robertsonian translocation) polymorphism has been 
previously documented in five of the ten extant equid species, namely, 
E. hemionus onager, E. hemionus kulan, E. kiang, E. africanus somaliensis, 
and E. quagga burchelli. Here we report evidence that the centric fission 
polymorphism involves the same homologous chromosome segments and 
has homology to human chromosome 4 (HSA4). Bacterial artificial chromosome
(BAC) clones containing equine (E. caballus, ECA) genes SMARCA5
(ECA2q21 homologue to HSA4p) and UCHL1 (ECA3q22 homologue to 
HSA4q) were mapped to a single metacentric chromosome and two unpaired 
acrocentrics by FISH mapping for individuals possessing odd numbers of 
chromosomes. These data suggest that the polymorphism is either ancient 
and conserved within the genus or has occurred recently and independently 
within each species. Since these species are separated by 1-3 million years of 
evolution, the persistence of this polymorphism would be remarkable and 
worthy of further investigations. 


A comparative gene map for the onager, Equus hemionus 
onager 

J.L. Myka,a T.L. Lear,a M.L. Houck,b O.A. Ryder,b E. Bailey a

a M.H. Gluck Equine Research Center, Department of Veterinary 
Science, University of Kentucky, Lexington, KY; b Center for 
Reproduction of Endangered Species, Zoological Society of San Diego, 
San Diego, CA (USA) 

While separated by less than 3.7 million years of evolution, the onager
(E. hemionus onager, EHO) has a modal diploid chromosome number of 2n = 56,
whereas the domestic horse (E. caballus, ECA) has a diploid chromosome number
of 2n = 64. The onager also has a documented centric fission (Robertsonian
translocation) polymorphism within its population, resulting in individuals
with a diploid chromosome number of 2n = 55. To explore the chromosome changes
between these two equids, a comparative gene map was constructed for the
onager by FISH mapping 50 BAC clones containing horse genes previously mapped
to ECA chromosomes. These clones represent 42 of 47 ECA chromosome arms,
including the X chromosome. The clones were hybridized to metaphase spreads of
an onager with a diploid chromosome number of 2n = 55. Several chromosomal
rearrangements between the two species were documented in this project and
will be presented along with the deduced human homology.


FISH analysis comparing genome organization in the
domestic horse (Equus caballus) to that of the Mongolian
wild horse (E. przewalskii)

J.L. Myka,a T.L. Lear,a M.L. Houck,b O.A. Ryder,b E. Bailey a

a M.H. Gluck Equine Research Center, Department of Veterinary 
Science, University of Kentucky, Lexington, KY; b Center for the 
Reproduction of Endangered Species, Zoological Society of San Diego, 
San Diego, CA (USA) 

Przewalski’s wild horse (E. przewalskii, EPR) has a diploid chromosome
number of 2n = 66 while the domestic horse (E. caballus, ECA) has a diploid
chromosome number of 2n = 64. Debates as to their phylogenetic relationship
and taxonomic classification have hinged on comparisons of their skeletal
morphology, protein and mtDNA similarities, their ability to produce fertile
hybrid offspring, and on comparison of their chromosome morphology and banding
patterns. Previous studies of GTG-banded karyotypes suggested that the
chromosomes of both equids were homologous and that the difference in
chromosome number was due to two pairs of acrocentric chromosomes in
E. przewalskii corresponding to one pair of metacentric chromosomes in
E. caballus (ECA5). To determine which E. przewalskii chromosomes were
homologous to ECA5 and to confirm the predicted chromosome homologies
based on GTG-banding, we constructed a comparative gene map between
E. caballus and E. przewalskii by FISH mapping 46 domestic horse-derived
BAC clones containing genes previously mapped to E. caballus chromosomes.
The results indicated that all E. caballus and E. przewalskii chromosomes
were homologous as defined by GTG-banding, and that the acrocentric
chromosomes EPR23 and EPR24 were homologues of the metacentric ECA5. The gross
chromosome homology that exists between E. caballus and E. przewalskii may
have implications for their phylogenetic relationship and species/subspecies
designation.


Identification of a centromeric repeat and verification of 
non-synteny of the MHC class I and II loci in channel 
catfish, Ictalurus punctatus, using fluorescence in situ 
hybridization 

S.M.-A. Quiniou, W.R. Wolters, G.C. Waldbieser 

USDA, ARS, Catfish Genetics Research Unit, National Warmwater 
Aquaculture Center, Stoneville, MS (USA) 

Channel catfish, Ictalurus punctatus (2n = 58), is the most economically
important finfish species in U.S. aquaculture. Selective breeding of
genetically superior broodstock is supported by quantitative genetic analysis
and molecular genetic tools such as microsatellite markers, a genetic linkage
map, EST libraries, and large-insert DNA libraries. Cytogenetic analyses are
beginning to be applied to increase the understanding of the structure and
function of the catfish genome. Fluorescently labeled probes prepared from
selected loci were hybridized to metaphase chromosomes prepared from
immortalized cultured B lymphocytes. Fluorescence in situ hybridization (FISH)
of a catfish Xba repetitive element to metaphase chromosomes demonstrated
hybridization to all centromeres. Fluorescent probes were prepared from BAC
clones containing the channel catfish MHC class I and class II loci. Genetic
linkage analysis had determined that these two loci were unlinked, and FISH
analyses verified that they were on separate chromosomes. Probes from
three separate BAC clones containing MHC class I genes co-localized on
one metaphase chromosome; however, one of these probes was separate in
interphase nuclei. Karyotype analysis was hindered by the resistance of
catfish chromosomes to standard banding methods and by their small size.
Although identification of individual catfish chromosomes was difficult,
these experiments demonstrated the utility of FISH in the analysis of
repetitive elements and gene synteny.


The first physical map - RH and FISH - of the equine
Y chromosome

T. Raudsepp, A. Santani, L.C. Skow, B.P. Chowdhary

Department of Veterinary Anatomy & Public Health, Texas A&M
University, College Station, TX (USA)

Studies in human and mouse show that genes located on the Y chromosome
are pivotal for male fertility. Stallion fertility is of prime importance in
the equine industry. However, very little is known about the content and
map organization of the equine Y chromosome (ECAY). This significantly
hampers studies aimed at identification of markers/genes affecting fertility
and other traits of significance in the horse. Hence, the present study was
carried out to develop a basic physical map of the equine Y. For this, we used
two contemporary approaches: RH and FISH mapping. Primers for 23
ECAY-specific markers, comprising genes (8), microsatellites (6) and sequence
tagged sites (STS; 9), were obtained from databases or published papers. In
some cases, we developed markers based on available sequences for human,
mouse and other species. Primer pairs for all markers were typed on our
5000 rad horse × hamster RH panel. The primers were also used to obtain
clones from two equine BAC libraries (TAMU and CHORI-241). The BAC clones were
FISH mapped individually, as well as in combinations, on horse metaphase
chromosomes and interphase chromatin to confirm their Y specificity, detect
regional locations and obtain the relative order of individual markers. A
physical map spanning almost the entire euchromatic region was obtained. The
RH and FISH maps complement each other and contribute to the first map of the
equine Y chromosome.


A genetic study on the history and kinship of the 
Einsiedler horse 

C. Riggenbach a,b, P.A. Poncet c, M.L. Glowatzki d, 
G. Stranzinger a,b, S. Rieder b 

a Institute of Animal Science, Swiss Federal Institute of Technology, 
Zurich; b Faculty of Veterinary Medicine, University of Zurich, Zurich; 
c Swiss National Stud Avenches; d Institute of Animal Genetics, 
Nutrition and Housing, Faculty of Veterinary Medicine, University of 
Bern, Bern (Switzerland) 

Since about the year 1000 A.D., the Benedictine Abbey Einsiedeln in 
central Switzerland was known for its horse breeding activities and livestock 
trading (“cavalli della madonna”). Einsiedler horses were also known and 
dispersed as horses from Schwyz, and Napoleon recruited them on his way to 
Russia in the early nineteenth century. In the 1970s the Swiss Sporthorse 
Breeding Association was founded and Einsiedler horses were registered to 
this organization. Swiss sporthorses are a composite breed genetically 
influenced by all kinds of European sporthorse breeds (e.g. from France, 
Germany, Ireland, Sweden). The goal of the present study was to analyze whether 
remaining Einsiedler stock, identified from pedigree entries, differs on a 
genetic level from the average of Swiss and European sporthorse populations, 
respectively. 

We first analyzed mtDNA sequences from 100 horses of different horse 
breeds, including Einsiedler horses from the few remaining maternal 
lineages. Special emphasis was given to the two apparently most ancient 
maternal lines, for which the earliest still existing and available pedigree 
annotations go back as far as the middle of the 19th century. According to the 
nomenclature proposed by Vila et al. (2001) and Jansen et al. (2002), these 
two Einsiedler lineages clustered into the clades A and D, which also comprise 
an important number of horses from Iberian populations. Iberian horses were 
well known for their riding capability and were traded among royals and the 
church since the 11th century. Therefore, it seems plausible to find “Iberian 
haplotypes” in a local population of horses, mainly controlled by an abbey for 
several centuries. A second analysis concerned the diversity, demarcation 
and individual assignments of the selected horses, assessed by 44 
microsatellite markers, dispersed over all 31 autosomal horse chromosomes and 
the X chromosome. 


Differential gene expression in epidermis of mice 
sensitive and resistant to phorbol ester tumor promotion 

P.K. Riggs, J.M. Angel, M.Y. Caballero, J. DiGiovanni 

University of Texas M.D. Anderson Cancer Center, Science Park 
Research Division, Smithville, TX (USA) 

Our previous two-stage carcinogenesis studies indicated that genetic 
control of susceptibility to tumor promotion by the phorbol ester 
12-O-tetradecanoylphorbol-13-acetate (TPA) in crosses between susceptible 
DBA/2J and resistant C57BL/6J mice is a multigenic trait. We mapped promotion 
susceptibility loci to distal mouse chromosomes 1 (Psl3), 2 (Psl2), 9 (Psl1), 
and 19 (Psl4), and narrowed the Psl1 locus to a ~40-cM region. Tumor study data 
from interval-specific congenic mouse strains suggest that at least three genes 
within this chromosomal region may affect the response to TPA. Because 
RNA expression profiling provides a powerful approach to identify potential 
susceptibility genes, we compared gene expression profiles of age-matched 
DBA/2J and C57BL/6J mice treated with TPA or acetone. Total RNA was 
extracted from epidermis, labeled with Cy3 or Cy5, and hybridized to mouse 
cDNA microarrays containing 8,737 unique gene and EST sequences. All 
hybridizations were conducted in duplicate with duplicate dye-flipped 
samples. Geometric median fluorescence values for the quadruplicate 
hybridization results were computed, and differentially expressed gene spots 
were identified as those for which signal values were twice background in at 
least one strain, the Cy3/Cy5 ratio indicated a two-fold difference in 
expression, and values were consistent across replicates and dye-flips. 
Experimental data indicated that TPA influenced expression of hundreds of 
transcripts. Of those genes, 65 exhibited differential expression between the 
two strains at 6 h after treatment, and a slightly smaller group of genes was 
differentially expressed 24 h after treatment. Differential expression 
patterns of ornithine decarboxylase (ODC), tissue inhibitor of 
metalloproteinase (Timp1), serine (or cysteine) proteinase inhibitor 
(SerpinB2), and an EST were verified by real-time quantitative RT-PCR (qPCR) 
and other assays. Studies are ongoing to verify additional differentially 
expressed genes as potential candidate genes for promotion susceptibility. 

Supported by NIEHS grant ES08355 (J.D.), UTMDACC Core Grant 
CA16672, and NIEHS Center Grant ES07784. 
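
The three spot-calling criteria described in the abstract above (signal twice background in at least one strain, a two-fold ratio, and agreement across replicates and dye-flips) can be sketched roughly as follows. This is an illustrative reading only, not the authors' pipeline: the function name `call_spot`, the tuple layout, the use of a plain median of log2 ratios as the summary, and the assumption that dye-flipped replicates have already been re-oriented to a common DBA/2J versus C57BL/6J direction are all hypothetical.

```python
import math
from statistics import median


def call_spot(replicates, fold=2.0, signal_factor=2.0):
    """Return True if a spot passes all three (assumed) criteria.

    replicates: list of (dba_signal, b6_signal, background) tuples, one per
    hybridization, with dye-flips already re-oriented (an assumption).
    """
    dba = median(r[0] for r in replicates)
    b6 = median(r[1] for r in replicates)
    background = median(r[2] for r in replicates)

    # Criterion 1: signal at least `signal_factor` x background in one strain.
    expressed = dba >= signal_factor * background or b6 >= signal_factor * background

    # Criterion 2: summary ratio indicates at least a `fold`-fold difference.
    log_ratios = [math.log2(r[0] / r[1]) for r in replicates]
    large_enough = abs(median(log_ratios)) >= math.log2(fold)

    # Criterion 3: every replicate (including dye-flips) agrees in direction.
    consistent = all(lr > 0 for lr in log_ratios) or all(lr < 0 for lr in log_ratios)

    return expressed and large_enough and consistent


# Hypothetical intensities for one spot across four hybridizations.
example = [(800, 300, 100), (900, 350, 110), (760, 310, 95), (840, 330, 105)]
print(call_spot(example))
```

Working in log2 space makes a two-fold increase and a two-fold decrease symmetric, which is why the threshold is applied to the absolute value of the summary log ratio.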


Clusters of overlapping BACs on the euchromatic region 
of the equine Y 

A.B. Santani, T. Raudsepp, L.C. Skow, B.P. Chowdhary 

Department of Veterinary Anatomy and Public Health, College of 
Veterinary Medicine, Texas A&M University, College Station, TX 
(USA) 

Deletions and rearrangements in the human/mouse Y chromosome are 
clearly implicated in male infertility. Detection of these molecular 
aberrations has been possible only due to a robust map of this chromosome in 
the two species. Compared to these species, very little is known about the Y 
chromosome in domesticated animals, in particular, the horse. Stallion infertility 
can result in serious financial losses mainly through loss of stud fees. Despite 
this, the role of equine Y-specific genes in regulating fertility is completely 
unexplored. 

Our efforts are currently focused on generating a detailed physical map of 
the Y chromosome in horse. To this end, available gene, microsatellite and 
STS markers were used to obtain BAC clones by screening the TAMU and 
CHORI-241 horse BAC libraries. End sequencing of the BAC clones was 
done to obtain new STS markers. Primer sets from these markers were then 
used for chromosome walking. This resulted in the development of a set of 
four contigs of overlapping BAC clones. Two of the contigs span ~2.5 Mb 
each, while the remaining two individually span, on average, ~0.5-0.7 Mb. 
Overall, the BACs cover ~40% of the euchromatic region on the horse Y 
chromosome. The contigs together contain seven functional genes, six 
microsatellites and 40 STSs. 

This marks a significant beginning to our long-term goal of developing a 
contig over the euchromatic region of the equine Y. The contig will provide a 
panel of ~100 uniformly spread markers that will be used to develop a 
molecular diagnostic test for early detection of potential fertility problems in 
foals destined for breeding. 


Sequence analysis and population genetics of a 
frameshift mutation in a bovid MHC class I gene 

L.C. Skow, N. Ramlachan, J.E. Womack 

College of Veterinary Medicine, Texas A&M University, 
College Station, TX (USA) 

During experiments to define allelic forms of bovine classical MHC class 
I genes, we identified an allele characterized by a two-base deletion in exon 2. 
The frameshift mutation results in premature stop codons in exon 3, 
suggesting that the protein product, if produced, would exist as a soluble 
class I molecule. Searches of the sequence databases revealed a single entry 
with the same frameshift described in a Holstein. A PCR-based genotyping assay 
was developed and used to survey cattle breeds. The deletion allele was 
detected in most breeds at low frequency. However, the deletion allele was the 
most common allele observed among feral cattle (Longhorn, Florida scrub). We 
conducted a similar survey of wild North American bison and found the deletion 
allele to be the most common allele in that bovid species. Haplotype 
sequencing of about 1000 bp of flanking sequence identified several deletion 
haplotypes in cattle and bison that were similar but distinctive between species. 
Phylogenetic analysis of the haplotype sequences indicates that the deletion 
predates the divergence of Bos and Bison and suggests that this allelic lineage 
is being maintained by selection, especially in feral and wild bovids. 

Supported by Texas Higher Education Coordinating Board (THECB) 
Advanced Technology Program. 


Analysis of sequence variation in genes expressed in the 
chicken pineal gland 

S. Hartman, J. Wynn, E. Long, E. Smith 

Comparative Genomics Lab, Virginia Tech, Blacksburg, VA (USA) 

A master clock known as the pineal gland regulates almost all the 
activities undertaken by animals. An assessment of gene expression as well as the 
identification of mutations that may affect important animal activities and 
function is therefore significant in animal agriculture. Though one other 
group has carried out an analysis of genes expressed in this organ, very little 
public information exists about the characteristics of genes expressed in the 
pineal gland. Here, we present an analysis of 192 novel Gallus gallus ESTs 
involving total DNA sequence of 135,389 base pairs, from a chicken pineal 
gland cDNA library. Additional characterization of the ESTs involved an in 
silico-based candidate SNP analysis using ten of the 37 ESTs that showed a 
significant sequence similarity (>98 %) to a database chicken gene sequence 
within a contiguous region of 600 base pairs or greater. Though knowledge of 
the physiological basis of rhythmic behavior in the chicken is extensive, our 
understanding of the molecular basis of circadian rhythms in the chicken is 
marginal. The data reported here lay the foundation for understanding the 
genetics of this important organ in an important agricultural and model 
species. 


Random-primer based genetic analysis of chickens 
divergently selected for humoral immune response 

T. O'Hare, T. Geng, P. Billam, E. Long, S. Guynn, E. Smith 

Department of Animal and Poultry Sciences, Virginia Tech, 
Blacksburg, VA (USA) 

Consumer and public health concerns about antibiotic use in chickens 
pose a significant challenge for the poultry industry. Recently, the popular 
press, led by the New York Times, has brought to the forefront the potential 
impact of antibiotic use in poultry feeds on the development of resistance by 
human pathogens (New York Times, Feb. 10, 2002). Natural immunity 
through genetic resistance, therefore, remains one of the few viable options 
open to the industry in meeting this challenge. Identification of DNA 
markers for immune response will facilitate the use of genetic resistance by 
breeders in breeding chickens less susceptible to disease-causing pathogens. We 
conducted a random-primer PCR-based genome-wide scan of chicken lines 
divergently selected for immune response to sheep red blood cells. Using 
DNA templates from high- and low-line birds in the 27th generation of 
selection for immune response, we have screened a total of 28 primers for their 
informativeness in the selected lines. Five primers revealed line-specific 
fragments that were cloned, sequenced and used to conduct SNP analysis and 
validate association with immune response. These markers were also 
mapped using the recently developed chicken radiation hybrid panel. Our efforts 
represent a complementary approach to ongoing efforts for QTL mapping of 
immune response genes using microsatellite markers. 


Nate Fechheimer Lecture 

New data on the effects of the rob(1;29) centric fusion in 
dairy cattle 

J. Kneubühler, H. Joerg, F. Menetrey, C. Hagger, 
G. Stranzinger 

Department of Animal Science, ETH Zurich, Zurich (Switzerland) 

614 daughters of a heterozygous carrier bull of the rob(1;29) centric 
fusion in the Brown Swiss breed were investigated. Both the cytogenetic 
makeup (fusion status) and the microsatellite haplotypes (BMC2228 
(173/179) and BMS4015 (148/154)) around the fusion site were determined 
in the daughters. 322 (52.4%) animals of reproductive age were carriers 
of the fusion. Testing of 93 preimplantation embryos after in vitro 
fertilization, using six microsatellites conserved in this centromere region, 
revealed a primary fusion ratio of 51% carriers. This ratio did not deviate 
significantly from the expected 1:1 ratio (χ² = 0.097, P > 0.05), indicating 
that this fusion did not impair fertilization or in vitro development up to 
the blastocyst stage. 
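
As a rough check on the goodness-of-fit statistic quoted above, the sketch below computes a one-degree-of-freedom chi-square test against a 1:1 expectation. The abstract does not give the embryo counts; a split of 48 carriers to 45 non-carriers among the 93 embryos is assumed here only because it is close to the reported 51% and reproduces the reported value of 0.097. The function name `chi_square_1to1` is hypothetical.

```python
import math


def chi_square_1to1(carriers, non_carriers):
    """Chi-square goodness-of-fit test against an expected 1:1 ratio (1 df)."""
    expected = (carriers + non_carriers) / 2
    stat = ((carriers - expected) ** 2 + (non_carriers - expected) ** 2) / expected
    # For one degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumed counts (not stated in the abstract): 48 carriers, 45 non-carriers.
stat, p_value = chi_square_1to1(48, 45)
print(f"chi-square = {stat:.3f}, P = {p_value:.2f}")
```

With these assumed counts the statistic rounds to 0.097 and the P value is well above 0.05, in line with the conclusion that the ratio does not deviate from 1:1.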

In the adult cows the type traits were not significantly different between 
the two groups. The fertility traits of greatest interest all showed small 
differences. Fusion-negative animals had a slight advantage in 12 out of 16 
investigated characteristics, but only one trait showed a significant 
difference (NRR 75 in first lactation cows, P = 0.046). In the first three 
lactations carriers had a higher milk yield, but milk content (fat and 
protein) was better in non-carriers, which was also reflected in the breeding 
values. Culling ratio and culling reasons did not differ between the two 
groups and were similar to the general Swiss population data. 

We have shown that a bovine 1.715 satellite DNA probe reveals great 
differences between the hybridization signals on different fusions such as 
rob(14;20) and rob(1;29). In rob(14;20) the probe gave a double signal, 
confirming the cytogenetic state of a dicentric chromosome, whereas 
in rob(1;29) this signal was missing, indicating a loss of heterochromatic 
material on both chromosomes. The function of the kinetochore must be 
different between the two types of fusion. 

Recommendations for the use of rob(1;29) carrier bulls in AI should be 
reconsidered. 


Avian telomere biology: a dynamic endgame 

S.E. Swanberg, M.E. Delany 

Department of Animal Science, Meyer Hall, University of California, 
Davis, CA (USA) 

Although vertebrate telomeres are highly conserved, telomere dynamics 
and telomerase profiles vary among species. Gallus gallus domesticus, the 
domestic chicken, has a long history as a model organism in developmental 
biology as well as human vaccine research and production in addition to its 
important role in food production and agricultural research. However, 
knowledge of telomere dynamics in avian species is limited. The objective of 
the present study was to examine telomerase activity and telomere length 
profiles of transformed and non-transformed avian cells in vitro. Telomerase 
activity was assayed using the Telomeric Repeat Amplification Protocol. 
Mean terminal restriction fragment (TRF) length and total telomeric DNA 
were quantified by densitometry of Southern blots. Chicken embryo 
fibroblasts (CEFs) derived from pooled or individual E11 embryos showed little 
or no telomerase activity from the earliest passages through senescence. 
Unexpectedly, a single population of particularly long-lived CEFs also 
showed telomerase activity after over 300 days in culture. Twelve 
transformed avian lines examined showed telomerase activity. In six 
non-transformed CEF cultures derived from individual embryos of an inbred line, 
TRF profiles demonstrated notable variability. Additionally, in each of these 
cultures, increases as well as decreases in mean TRF length were observed 
over time with a net decrease in mean TRF observed at senescence in four of 
the cultures. In two cultures, a net increase in mean TRF was observed. In 
spite of the variability among mean TRF profiles, all six of these cultures 
demonstrated a dramatic loss of telomeric DNA over the lifetime of each 
culture. Cells with critically shortened telomeres are thought to enter a state 
characterized by reduced replicative potential. The elimination of clonal 
populations signaled by critically short telomeres from the replicating pool of 
cells would explain an increase in mean TRF length accompanied by an 
overall loss of telomeric DNA. Telomere length profiles of several transformed 
cell types demonstrated little of the typical TRF smear suggesting these cells 
may possess a reduced amount of telomeric DNA. 


Aberrations in canine multicentric lymphomas detected 
with comparative genomic hybridization and a panel of 
single locus probes 

R. Thomas a, K.C. Smith a, E.A. Ostrander c, F. Galibert d, 
M. Breen a,b 

a Animal Health Trust, Newmarket, Suffolk (UK); b College of 
Veterinary Medicine, NCSU, Raleigh, NC (USA); c Fred Hutchinson 
Cancer Research Center, Seattle, WA (USA); d UMR 6061 CNRS, 
Faculte de Medecine, Rennes (France) 

The clinical presentation, histology and biology of many canine cancers 
closely parallel those of human malignancies, and their extensive genome 
homology is well established. Comparative studies of related human and 
canine malignancies can make a significant contribution towards the 
understanding of tumor development in both species, and to improving tools for 
diagnosis, prognosis and therapy. Our ongoing studies focus on the molecular 
cytogenetic evaluation of canine cancers, the characterization of non-random 
genomic abnormalities, and comparison with knowledge gained from more 
widely studied human counterparts. 

As an example, data will be presented on comparative genomic 
hybridization analysis of 25 cases of canine malignant multicentric lymphoma. 
This represents the most frequent life-threatening cancer in dogs, comprising 
approximately 20% of all canine malignancies. Aberrations involved 32 of 
the 38 canine autosomes, with a maximum of 12 per case and a mean of 
three. Genomic gains were almost twice as common as losses. A subset of 
frequently encountered aberrations was detected and their identity 
confirmed using a panel of canine chromosome-specific BAC probes we have 
developed for studies of this nature. Potential correlations with 
immunophenotype and histological subtype have been observed. 

We aim to develop this approach for cytogenetic subdivision of this 
heterogeneous canine disease, and to correlate findings with clinical outcome. 
Comparisons may also be drawn with homologous aberrations observed in 
human lymphoma, suggesting that a related genetic aetiology may be 
involved. This will form the framework for more detailed comparative 
studies. We are now developing higher resolution resources for canine CGH 
analysis, in the form of a microarray comprising a genome-wide set of 
cytogenetically ordered BAC clones (see abstract of Breen et al.). 


Update on the cattle genome mapping project 

James E. Womack 

Department of Veterinary Pathobiology, Texas A&M University, 
College Station, TX (USA) 

Cattle gene mapping has come a long way since the primitive synteny 
maps of the late 1980s. Two linkage maps replete with polymorphic 
microsatellites were generated in 1994 and both synteny groups and linkage 
groups were anchored to chromosomes by in situ hybridization to Q-banded cattle 
chromosomes. These advances precipitated the genetic mapping of a large 
number of traits, including quantitative traits, in the 1990s. 
Comparative mapping was facilitated by the development of Zoo-FISH 
technologies using human single chromosome paints and by the development of 
radiation hybrid panels and ESTs in the late 1990s. The development of YAC 
and BAC libraries complemented these technologies and facilitated the 
discovery of mutations underlying a few of the mapped traits in the present 
decade. 

A call by the NIH for white papers to nominate new model organisms for 
whole genome sequencing at the NHGRI sequencing centers in 2002 
prompted the development of a white paper proposal to sequence the cattle 
genome. The nomination was enhanced by an ongoing BAC map consortium 
that promised to provide a complete physical map to underlie the sequencing 
effort. The bovine genome was given high priority, although the NIH has 
stipulated that at least half the cost should be borne by agricultural interests. 
An extensive international fundraising effort has resulted in a commitment 
of the required funding and we are optimistic that sequencing will begin in 
the fall of 2003. 